#
95409e27 |
|
03-Apr-2024 |
Hannes Reinecke <hare@kernel.org> |
nvmet: implement unique discovery NQN Unique discovery NQNs allow to differentiate between discovery services from (typically physically separate) NVMe-oF subsystems. This is required for establishing secured connections as otherwise the credentials won't be unique and the integrity of the connection cannot be guaranteed. This patch adds a configfs attribute 'discovery_nqn' in the 'nvmet' configfs directory to specify the unique discovery NQN. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
ca2b221d |
|
23-Jan-2024 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvmet: introduce new max queue size configuration entry Using this port configuration, one will be able to set the maximal queue size to be used for any controller that will be associated to the configured port. The default value stayed 1024 but each transport will be able to set the its own values before enabling the port. Introduce lower limit of 16 for minimal queue depth (same as we use in the host fabrics drivers). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Guixin Liu <kanie@linux.alibaba.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
c82c370d |
|
23-Jan-2024 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvmet: set ctrl pi_support cap before initializing cap reg This is a preparation for setting the maximal queue size of a controller that supports PI. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
41951f83 |
|
23-Jan-2024 |
Chaitanya Kulkarni <kch@nvidia.com> |
nvmet: add module description to stop warnings Add MODULE_DESCRIPTION() in order to remove warnings & get clean build:- WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvmet.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvme-loop.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvmet-rdma.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvmet-fc.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvme-fcloop.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvme/target/nvmet-tcp.o Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
41353fba |
|
18-Jan-2024 |
Guixin Liu <kanie@linux.alibaba.com> |
nvmet: unify aer type enum The host and target use two definition of aer type, unify them into a single one. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
4ba8b3f7 |
|
12-Dec-2023 |
Guixin Liu <kanie@linux.alibaba.com> |
nvmet: remove cntlid_min and cntlid_max check in nvmet_alloc_ctrl The cntlid_min and cntlid_max are checked in configfs, don't check again in nvmet_alloc_ctrl(). Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
6173a77b |
|
05-Mar-2023 |
Damien Le Moal <damien.lemoal@opensource.wdc.com> |
nvmet: avoid potential UAF in nvmet_req_complete() An nvme target ->queue_response() operation implementation may free the request passed as argument. Such implementation potentially could result in a use after free of the request pointer when percpu_ref_put() is called in nvmet_req_complete(). Avoid such problem by using a local variable to save the sq pointer before calling __nvmet_req_complete(), thus avoiding dereferencing the req pointer after that function call. Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
68c5444c |
|
15-Nov-2022 |
Aleksandr Miloserdov <a.miloserdov@yadro.com> |
nvmet: expose firmware revision to configfs Allow user to set currently active firmware revision Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Aleksandr Miloserdov <a.miloserdov@yadro.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
23855abd |
|
15-Nov-2022 |
Aleksandr Miloserdov <a.miloserdov@yadro.com> |
nvmet: expose IEEE OUI to configfs Allow user to set OUI for the controller vendor. Reviewed-by: Konstantin Shelekhin <k.shelekhin@yadro.com> Reviewed-by: Dmitriy Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Aleksandr Miloserdov <a.miloserdov@yadro.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
fa8f9ac4 |
|
07-Nov-2022 |
Christoph Hellwig <hch@lst.de> |
nvmet: only allocate a single slab for bvecs There is no need to have a separate slab cache for each namespace, and having separate ones creates duplicate debugs file names as well. Fixes: d5eff33ee6f8 ("nvmet: add simple file backed ns support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
bbf5410b |
|
20-Oct-2022 |
Uros Bizjak <ubizjak@gmail.com> |
nvmet: use try_cmpxchg in nvmet_update_sq_head Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in nvmet_update_sq_head. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. Note that the value from *ptr should be read using READ_ONCE to prevent the compiler from merging, refetching or reordering the read. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ddd2b8de |
|
28-Sep-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: fix workqueue MEM_RECLAIM flushing dependency The keep alive timer needs to stay on nvmet_wq, and not modified to reschedule on the system_wq. This fixes a warning: ------------[ cut here ]------------ workqueue: WQ_MEM_RECLAIM nvmet-wq:nvmet_rdma_release_queue_work [nvmet_rdma] is flushing !WQ_MEM_RECLAIM events:nvmet_keep_alive_timer [nvmet] WARNING: CPU: 3 PID: 1086 at kernel/workqueue.c:2628 check_flush_dependency+0x16c/0x1e0 Reported-by: Yi Zhang <yi.zhang@redhat.com> Fixes: 8832cf922151 ("nvmet: use a private workqueue instead of the system workqueue") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
1befd944 |
|
20-Sep-2022 |
Christoph Hellwig <hch@lst.de> |
nvmet-auth: don't try to cancel a non-initialized work_struct Currently blktests nvme/002 trips up debugobjects if CONFIG_NVME_AUTH is enabled, but authentication is not on a queue. This is because nvmet_auth_sq_free cancels sq->auth_expired_work unconditionaly, while auth_expired_work is only ever initialized if authentication is enabled for a given controller. Fix this by calling most of what is nvmet_init_auth unconditionally when initializing the SQ, and just do the setting of the result field in the connect command handler. Fixes: db1312dd9548 ("nvmet: implement basic In-Band Authentication") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de>
|
#
6a02a61e |
|
12-Aug-2022 |
Bart Van Assche <bvanassche@acm.org> |
nvmet: fix a use-after-free Fix the following use-after-free complaint triggered by blktests nvme/004: BUG: KASAN: user-memory-access in blk_mq_complete_request_remote+0xac/0x350 Read of size 4 at addr 0000607bd1835943 by task kworker/13:1/460 Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop] Call Trace: show_stack+0x52/0x58 dump_stack_lvl+0x49/0x5e print_report.cold+0x36/0x1e2 kasan_report+0xb9/0xf0 __asan_load4+0x6b/0x80 blk_mq_complete_request_remote+0xac/0x350 nvme_loop_queue_response+0x1df/0x275 [nvme_loop] __nvmet_req_complete+0x132/0x4f0 [nvmet] nvmet_req_complete+0x15/0x40 [nvmet] nvmet_execute_io_connect+0x18a/0x1f0 [nvmet] nvme_loop_execute_work+0x20/0x30 [nvme_loop] process_one_work+0x56e/0xa70 worker_thread+0x2d1/0x640 kthread+0x183/0x1c0 ret_from_fork+0x1f/0x30 Cc: stable@vger.kernel.org Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
db1312dd |
|
27-Jun-2022 |
Hannes Reinecke <hare@suse.de> |
nvmet: implement basic In-Band Authentication Implement NVMe-oF In-Band authentication according to NVMe TPAR 8006. This patch adds three additional configfs entries 'dhchap_key', 'dhchap_ctrl_key', and 'dhchap_hash' to the 'host' configfs directory. The 'dhchap_key' and 'dhchap_ctrl_key' entries need to be in the ASCII format as specified in NVMe Base Specification v2.0 section 8.13.5.8 'Secret representation'. 'dhchap_hash' defaults to 'hmac(sha256)', and can be written to to switch to a different HMAC algorithm. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
6490c9ed |
|
27-Jun-2022 |
Hannes Reinecke <hare@suse.de> |
nvmet: parse fabrics commands on io queues Some fabrics commands can be sent via io queues, so add a new function nvmet_parse_fabrics_io_cmd() and rename the existing nvmet_parse_fabrics_cmd() to nvmet_parse_fabrics_admin_cmd(). Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
34ad6151 |
|
27-Jun-2022 |
Alan Adamson <alan.adamson@oracle.com> |
nvmet: add a clear_ids attribute for passthru targets If the clear_ids attribute is set to true, the EUI/GUID/UUID is cleared for the passthru target. By default, loop targets will set clear_ids to true. This resolves an issue where a connect to a passthru target fails when using a trtype of 'loop' because EUI/GUID/UUID is not unique. Fixes: 2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique") Signed-off-by: Alan Adamson <alan.adamson@oracle.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
63bc732c |
|
17-Mar-2022 |
Colin Ian King <colin.king@intel.com> |
nvmet: remove redundant assignment after left shift The left shift is followed by a re-assignment back to cc_css, the assignment is redundant. Fix this by replacing the "<<=" operator with "<<" instead. This cleans up the clang scan build warning: drivers/nvme/target/core.c:1124:10: warning: Although the value stored to 'cc_css' is used in the enclosing expression, the value is never actually read from 'cc_css' [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
8832cf92 |
|
21-Mar-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: use a private workqueue instead of the system workqueue Any attempt to flush kernel-global WQs has possibility of deadlock so we should simply stop using them, instead introduce nvmet_wq which is the generic nvmet workqueue for work elements that don't explicitly require a dedicated workqueue (by the mere fact that they are using the system_wq). Changes were done using the following replaces: - s/schedule_work(/queue_work(nvmet_wq, /g - s/schedule_delayed_work(/queue_delayed_work(nvmet_wq, /g - s/flush_scheduled_work()/flush_workqueue(nvmet_wq)/g Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
da783733 |
|
15-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate nvmet_ns_changed states via lockdep that the ns->subsys->lock must be held. The only caller of nvmet_ns_changed which does not acquire that lock is nvmet_ns_revalidate. nvmet_ns_revalidate has 3 callers, of which 2 do not acquire that lock: nvmet_execute_identify_cns_cs_ns and nvmet_execute_identify_ns. The other caller nvmet_ns_revalidate_size_store does acquire the lock. Move the call to nvmet_ns_changed from nvmet_ns_revalidate to the callers so that they can perform the correct locking as needed. This issue was found using a static type-based analyser and manually verified. Reported-by: Niels Dossche <dossche.niels@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
|
#
22027a98 |
|
14-Feb-2022 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: replace ida_simple[get|remove] with the simler ida_[alloc|free] ida_simple_[get|remove] are wrappers anyways. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
0c48645a |
|
15-Mar-2022 |
Hannes Reinecke <hare@suse.de> |
nvmet: revert "nvmet: make discovery NQN configurable" Revert commit 626851e9225d ("nvmet: make discovery NQN configurable"); the interface was deemed incorrect and will be replaced with a different one. Fixes: 626851e9225d ("nvmet: make discovery NQN configurable") Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
2953b30b |
|
18-Oct-2021 |
Hannes Reinecke <hare@suse.de> |
nvmet: register discovery subsystem as 'current' Register the discovery subsystem as the 'current' discovery subsystem, and add a new discovery log page entry for it. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a294711e |
|
22-Sep-2021 |
Hannes Reinecke <hare@suse.de> |
nvmet: add nvmet_is_disc_subsys() helper Add a helper function to determine if a given subsystem is a discovery subsystem. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
626851e9 |
|
22-Sep-2021 |
Hannes Reinecke <hare@suse.de> |
nvmet: make discovery NQN configurable TPAR8013 allows for unique discovery NQNs, so make the discovery controller NQN configurable by exposing a subsys attribute 'discovery_nqn'. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
6d1555cc |
|
22-Sep-2021 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvmet: add get_max_queue_size op for controllers Some transports, such as RDMA, would like to set the queue size according to device/port/ctrl characteristics. Add a new nvmet transport op that is called during ctrl initialization. This will not effect transports that don't implement this option. Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ab7a2737 |
|
27-Aug-2021 |
Christoph Hellwig <hch@lst.de> |
nvmet: return bool from nvmet_passthru_ctrl and nvmet_is_passthru_req The target core code never needs the host-side nvme_ctrl structure. Open code two uses of nvmet_is_passthru_req in passthru.c, and then switch the helpers used by the core to return bool. Also rename the fuctions to better match their usage: nvmet_passthru_ctrl -> nvmet_is_passthru_subsys nvmet_req_passthru_ctrl -> nvmet_is_passthru_req Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
#
77d651a6 |
|
26-Aug-2021 |
Adam Manzanares <a.manzanares@samsung.com> |
nvmet: looks at the passthrough controller when initializing CAP For a passthru controller make cap initialization dependent on the cap of the passthru controller, given that multiple Command Set support needs to be supported by the underlying controller. For that move the initialization of CAP later so that it can use the fully initialized nvmet_ctrl structure. Fixes: ab5d0b38c047 (nvmet: add Command Set Identifier support) Signed-off-by: Adam Manzanares <a.manzanares@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> [hch: refactored the code a bit to keep it more contained in passthru.c] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
b71df126 |
|
08-Aug-2021 |
Amit Engel <amit.engel@dell.com> |
nvmet: avoid duplicate qid in connect cmd According to the NVMe specification, if the host sends a Connect command specifying a queue id which has already been created, a status value of NVME_SC_CMD_SEQ_ERROR is returned. Signed-off-by: Amit Engel <amit.engel@dell.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
aaf2e048 |
|
09-Jun-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add ZBD over ZNS backend support NVMe TP 4053 – Zoned Namespaces (ZNS) allows host software to communicate with a non-volatile memory subsystem using zones for NVMe protocol-based controllers. NVMeOF already support the ZNS NVMe Protocol compliant devices on the target in the passthru mode. There are generic zoned block devices like Shingled Magnetic Recording (SMR) HDDs that are not based on the NVMe protocol. This patch adds ZNS backend support for non-ZNS zoned block devices as NVMeOF targets. This support includes implementing the new command set NVME_CSI_ZNS, adding different command handlers for ZNS command set such as NVMe Identify Controller, NVMe Identify Namespace, NVMe Zone Append, NVMe Zone Management Send and NVMe Zone Management Receive. With the new command set identifier, we also update the target command effects logs to reflect the ZNS compliant commands. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ab5d0b38 |
|
09-Jun-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add Command Set Identifier support NVMe TP 4056 allows controllers to support different command sets. NVMeoF target currently only supports namespaces that contain traditional logical blocks that may be randomly read and written. In some applications there is a value in exposing namespaces that contain logical blocks that have special access rules (e.g. sequentially write required namespace such as Zoned Namespace (ZNS)). In order to support the Zoned Block Devices (ZBD) backend, controllers need to have support for ZNS Command Set Identifier (CSI). In this preparation patch, we adjust the code such that it can now support the default command set identifier. We update the namespace data structure to store the CSI value which defaults to NVME_CSI_NVM that represents traditional logical blocks namespace type. The CSI support is required to implement the ZBD backend for NVMeOF with host side NVMe ZNS interface, since ZNS commands belong to the different command set than the default one. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7860569a |
|
13-Jun-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove local variable In function errno_to_nvme_status() we store the value of the NVMe status into the local variable and don't do anything useful with that but just return. Remove the local variable and return the value directly from switch. This also removed extra break statements. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
8bb6cb9b |
|
13-Jun-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: use nvme status value directly There is no point in keeping the status variable that is used only once in the function nvmet_async_events_failall(). Remove the variable and use the value directly. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
245067e3 |
|
13-Jun-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: use u32 type for the local variable nsid In function nvmet_max_nsid() we calculate the max nsid by iterating over the XArray and store it in the variable nsid that has type of unsigned long. Since the value of this function is stored into the subsys->max_nsid which is of type u32, change the local variable nsid type and the return type of the same function to u32. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
0d148efd |
|
06-Jun-2021 |
Noam Gottlieb <ngottlieb@nvidia.com> |
nvmet: allow mn change if subsys not discovered Currently, once the subsystem's model_number is set for the first time there is no way to change it. However, as long as no connection was established to nvmf target, there is no reason for such restriction and we should allow to change the subsystem's model_number as many times as needed. In addition, in order to simplfy the changes and make the model number flow more similar to the rest of the attributes in the Identify Controller data structure, we set a default value for the model number at the initiation of the subsystem. Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Noam Gottlieb <ngottlieb@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e13b0615 |
|
06-Jun-2021 |
Noam Gottlieb <ngottlieb@nvidia.com> |
nvmet: change sn size and check validity According to the NVM specification, the serial_number should be 20 bytes (bytes 23:04 of the Identify Controller data structure), and should contain only ASCII characters. In accordance, the serial_number size is changed to 20 bytes and before any attempt to store a new value in serial_number we check that the input is valid - i.e. contains only ASCII characters, is not empty and does not exceed 20 bytes. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Noam Gottlieb <ngottlieb@nvidia.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
f6e8bd59 |
|
22-Apr-2021 |
Amit Engel <amit.engel@dell.com> |
nvmet: move ka_work initialization to nvmet_alloc_ctrl Initialize keep-alive work only once, as part of alloc_ctrl and not each time that nvmet_start_keep_alive_timer is being called Signed-off-by: Amit Engel <amit.engel@dell.com> Reviewed-by: Hou Pu <houpu.main@gmail.com>
|
#
bcd9a079 |
|
01-Jun-2021 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvmet: fix freeing unallocated p2pmem In case p2p device was found but the p2p pool is empty, the nvme target is still trying to free the sgl from the p2p pool instead of the regular sgl pool and causing a crash (BUG() is called). Instead, assign the p2p_dev for the request only if it was allocated from p2p pool. This is the crash that was caused: [Sun May 30 19:13:53 2021] ------------[ cut here ]------------ [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! [Sun May 30 19:13:53 2021] invalid opcode: 0000 [#1] SMP PTI ... [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! ... [Sun May 30 19:13:53 2021] RIP: 0010:gen_pool_free_owner+0xa8/0xb0 ... [Sun May 30 19:13:53 2021] Call Trace: [Sun May 30 19:13:53 2021] ------------[ cut here ]------------ [Sun May 30 19:13:53 2021] pci_free_p2pmem+0x2b/0x70 [Sun May 30 19:13:53 2021] pci_p2pmem_free_sgl+0x4f/0x80 [Sun May 30 19:13:53 2021] nvmet_req_free_sgls+0x1e/0x80 [nvmet] [Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518! [Sun May 30 19:13:53 2021] nvmet_rdma_release_rsp+0x4e/0x1f0 [nvmet_rdma] [Sun May 30 19:13:53 2021] nvmet_rdma_send_done+0x1c/0x60 [nvmet_rdma] Fixes: c6e3f1339812 ("nvmet: add metadata support for block devices") Reviewed-by: Israel Rukshin <israelr@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
aaeadd70 |
|
25-May-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: fix false keep-alive timeout when a controller is torn down Controller teardown flow may take some time in case it has many I/O queues, and the host may not send us keep-alive during this period. Hence reset the traffic based keep-alive timer so we don't trigger a controller teardown as a result of a keep-alive expiration. Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
fec356a6 |
|
18-May-2021 |
Wu Bo <wubo40@huawei.com> |
nvmet: fix memory leak in nvmet_alloc_ctrl() When creating ctrl in nvmet_alloc_ctrl(), if the cntlid_min is larger than cntlid_max of the subsystem, and jumps to the "out_free_changed_ns_list" label, but the ctrl->sqs lack of be freed. Fix this by jumping to the "out_free_sqs" label. Fixes: 94a39d61f80f ("nvmet: make ctrl-id configurable") Signed-off-by: Wu Bo <wubo40@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
de587804 |
|
09-Mar-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove unnecessary ctrl parameter The function nvmet_ctrl_find_get() accepts out pointer to nvmet_ctrl structure. This function returns the same error value from two places that is :- NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR. Move this to the caller so we can change the return type to nvmet_ctrl. Now that we can changed the return type, instead of taking out pointer to the nvmet_ctrl structure remove that function parameter and return the valid nvmet_ctrl pointer on success and NULL on failure. Also, add and rename the goto labels for more readability with comments. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7798df6f |
|
24-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove an unnecessary function parameter to nvmet_check_ctrl_status In nvmet_check_ctrl_status() cmd can be derived from nvmet_req. Remove the local variable cmd in the nvmet_check_ctrl_status() and function parameter cmd for nvmet_check_ctrl_status(). Derive the cmd value from req parameter in the nvmet_check_ctrl_status(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a56f14c2 |
|
24-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: update error log page in nvmet_alloc_ctrl() Instead of updating the error log page in the caller of the nvmet_alloc_ctrt() update the error log page in the nvmet_alloc_ctrl(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
76affbe6 |
|
24-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove a duplicate status assignment in nvmet_alloc_ctrl In the function nvmet_alloc_ctrl() we assign status value before we call nvmet_fine_get_subsys() to: status = NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR; After we successfully find the subsystem we again set the status value to: status = NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR; Remove the duplicate status assignment value. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d218a8a3 |
|
15-Mar-2021 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: don't check iosqes,iocqes for discovery controllers From the base spec, Figure 78: "Controller Configuration, these fields are defined as parameters to configure an "I/O Controller (IOC)" and not to configure a "Discovery Controller (DC). ... If the controller does not support I/O queues, then this field shall be read-only with a value of 0h Just perform this check for I/O controllers. Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Reported-by: Belanger, Martin <Martin.Belanger@dell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d9f273b7 |
|
17-Feb-2021 |
Max Gurtovoy <mgurtovoy@nvidia.com> |
nvmet: model_number must be immutable once set In case we have already established connection to nvmf target, it shouldn't be allowed to change the model_number. E.g. if someone will identify ctrl and get model_number of "my_model" later on will change the model_numbel via configfs to "my_new_model" this will break the NVMe specification for "Get Log Page – Persistent Event Log" that refers to Model Number as: "This field contains the same value as reported in the Model Number field of the Identify Controller data structure, bytes 63:24." Although it doesn't mentioned explicitly that this field can't be changed, we can assume it. So allow setting this field only once: using configfs or in the first identify ctrl operation. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
295a39f5 |
|
09-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove else at the end of the function The function nvmet_parse_io_cmd() returns value from nvmet_file_parse_io_cmd() or nvmet_bdev_parse_io_cmd() based on which backend is set for the request. Remove the else and just return the value from nvmet_bdev_parse_io_cmd(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
20c2c3bb |
|
09-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add nvmet_req_subsys() helper Just like what we have to get the passthru ctrl from the req, add an helper to get the subsystem associated with the nvmet_req() instead of open coding the chain of structures. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d81d57cf |
|
09-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add helper to report invalid opcode In the NVMeOF block device backend, file backend, and passthru backend we reject and report the commands if opcode is not handled. Add an helper and use it in block device backend to keep the code and error message uniform. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
3a1f7c79 |
|
09-Feb-2021 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: make nvmet_find_namespace() req based The six callers of nvmet_find_namespace() duplicate the error log page update and status setting code for each call on failure. All callers are nvmet requests based functions, so we can pass req to the nvmet_find_namesapce() & derive ctrl from req, that'll allow us to update the error log page in nvmet_find_namespace(). Now that we pass the request we can also get rid of the local variable in nvmet_find_namespace() and use the req->ns and return the error code. Replace the ctrl parameter with nvmet_req for nvmet_find_namespace(), centralize the error log page update for non allocated namesapces, and return uniform error for non-allocated namespace. The nvmet_find_namespace() takes nsid parameter which is from NVMe commands structures such as get_log_page, identify, rw and common. All these commands have same offset for the nsid field. Derive nsid from req->cmd->common.nsid) & remove the extra parameter from the nvmet_find_namespace(). Lastly now we associate the ns to the req parameter that we pass to the nvmet_find_namespace(), rename nvmet_find_namespace() to nvmet_req_find_ns(). Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
6d65aeab |
|
15-Nov-2020 |
Amit <amit.engel@dell.com> |
nvmet: remove unused ctrl->cqs remove unused cqs from nvmet_ctrl struct this will reduce the allocated memory. Signed-off-by: Amit <amit.engel@dell.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
3c3751f2 |
|
22-Oct-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: fix a NULL pointer dereference when tracing the flush command When target side trace in turned on and flush command is issued from the host it results in the following Oops. [ 856.789724] BUG: kernel NULL pointer dereference, address: 0000000000000068 [ 856.790686] #PF: supervisor read access in kernel mode [ 856.791262] #PF: error_code(0x0000) - not-present page [ 856.791863] PGD 6d7110067 P4D 6d7110067 PUD 66f0ad067 PMD 0 [ 856.792527] Oops: 0000 [#1] SMP NOPTI [ 856.792950] CPU: 15 PID: 7034 Comm: nvme Tainted: G OE 5.9.0nvme-5.9+ #71 [ 856.793790] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e3214 [ 856.794956] RIP: 0010:trace_event_raw_event_nvmet_req_init+0x13e/0x170 [nvmet] [ 856.795734] Code: 41 5c 41 5d c3 31 d2 31 f6 e8 4e 9b b8 e0 e9 0e ff ff ff 49 8b 55 00 48 8b 38 8b 0 [ 856.797740] RSP: 0018:ffffc90001be3a60 EFLAGS: 00010246 [ 856.798375] RAX: 0000000000000000 RBX: ffff8887e7d2c01c RCX: 0000000000000000 [ 856.799234] RDX: 0000000000000020 RSI: 0000000057e70ea2 RDI: ffff8887e7d2c034 [ 856.800088] RBP: ffff88869f710578 R08: ffff888807500d40 R09: 00000000fffffffe [ 856.800951] R10: 0000000064c66670 R11: 00000000ef955201 R12: ffff8887e7d2c034 [ 856.801807] R13: ffff88869f7105c8 R14: 0000000000000040 R15: ffff88869f710440 [ 856.802667] FS: 00007f6a22bd8780(0000) GS:ffff888813a00000(0000) knlGS:0000000000000000 [ 856.803635] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 856.804367] CR2: 0000000000000068 CR3: 00000006d73e0000 CR4: 00000000003506e0 [ 856.805283] Call Trace: [ 856.805613] nvmet_req_init+0x27c/0x480 [nvmet] [ 856.806200] nvme_loop_queue_rq+0xcb/0x1d0 [nvme_loop] [ 856.806862] blk_mq_dispatch_rq_list+0x123/0x7b0 [ 856.807459] ? kvm_sched_clock_read+0x14/0x30 [ 856.808025] __blk_mq_sched_dispatch_requests+0xc7/0x170 [ 856.808708] blk_mq_sched_dispatch_requests+0x30/0x60 [ 856.809372] __blk_mq_run_hw_queue+0x70/0x100 [ 856.809935] __blk_mq_delay_run_hw_queue+0x156/0x170 [ 856.810574] blk_mq_run_hw_queue+0x86/0xe0 [ 856.811104] blk_mq_sched_insert_request+0xef/0x160 [ 856.811733] blk_execute_rq+0x69/0xc0 [ 856.812212] ? blk_mq_rq_ctx_init+0xd0/0x230 [ 856.812784] nvme_execute_passthru_rq+0x57/0x130 [nvme_core] [ 856.813461] nvme_submit_user_cmd+0xeb/0x300 [nvme_core] [ 856.814099] nvme_user_cmd.isra.82+0x11e/0x1a0 [nvme_core] [ 856.814752] blkdev_ioctl+0x1dc/0x2c0 [ 856.815197] block_ioctl+0x3f/0x50 [ 856.815606] __x64_sys_ioctl+0x84/0xc0 [ 856.816074] do_syscall_64+0x33/0x40 [ 856.816533] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 856.817168] RIP: 0033:0x7f6a222ed107 [ 856.817617] Code: 44 00 00 48 8b 05 81 cd 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 8 [ 856.819901] RSP: 002b:00007ffca848f058 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 856.820846] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f6a222ed107 [ 856.821726] RDX: 00007ffca848f060 RSI: 00000000c0484e43 RDI: 0000000000000003 [ 856.822603] RBP: 0000000000000003 R08: 000000000000003f R09: 0000000000000005 [ 856.823478] R10: 00007ffca848ece0 R11: 0000000000000202 R12: 00007ffca84912d3 [ 856.824359] R13: 00007ffca848f4d0 R14: 0000000000000002 R15: 000000000067e900 [ 856.825236] Modules linked in: nvme_loop(OE) nvmet(OE) nvme_fabrics(OE) null_blk nvme(OE) nvme_corel Move the nvmet_req_init() tracepoint after we parse the command in nvmet_req_init() so that we can get rid of the duplicate nvmet_find_namespace() call. Rename __assign_disk_name() -> __assign_req_name(). Now that we call tracepoint after parsing the command simplify the newly added __assign_req_name() which fixes this bug. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
85bd23f3 |
|
14-Oct-2020 |
zhenwei pi <pizhenwei@bytedance.com> |
nvmet: fix uninitialized work for zero kato When connecting a controller with a zero kato value using the following command line nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0 the warning below can be reproduced: WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90 with trace: mod_delayed_work_on+0x59/0x90 nvmet_update_cc+0xee/0x100 [nvmet] nvmet_execute_prop_set+0x72/0x80 [nvmet] nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp] nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp] ... This is caused by queuing up an uninitialized work. Althrough the keep-alive timer is disabled during allocating the controller (fixed in 0d3b6a8d213a), ka_work still has a chance to run (called by nvmet_start_ctrl). Fixes: 0d3b6a8d213a ("nvmet: Disable keep-alive timer when kato is cleared to 0h") Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
4e683c48 |
|
16-Sep-2020 |
Amit Engel <amit.engel@dell.com> |
nvmet: handle keep-alive timer when kato is modified by a set features cmd A user may modify the kato by a set features cmd. To properly deal with races or a kato value of 0 (no keep alive enabled) change nvmet_set_feat_kato to first disable the timer, then set the value and then re-enable the timer. Signed-off-by: Amit Engel <amit.engel@dell.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
df561f66 |
|
23-Aug-2020 |
Gustavo A. R. Silva <gustavoars@kernel.org> |
treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
#
0d3b6a8d |
|
19-Aug-2020 |
Amit Engel <amit.engel@dell.com> |
nvmet: Disable keep-alive timer when kato is cleared to 0h Based on nvme spec, when keep alive timeout is set to zero the keep-alive timer should be disabled. Signed-off-by: Amit Engel <amit.engel@dell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ba76af67 |
|
24-Jul-2020 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet: Add passthru enable/disable helpers This patch adds helper functions which are used in the NVMeOF configfs when the user is configuring the passthru subsystem. Here we ensure that only one subsys is assigned to each nvme_ctrl by using an xarray on the cntlid. The subsystem's version number is overridden by the passed through controller's version. However, if that version is less than 1.2.1, then we bump the advertised version to that and print a warning in dmesg. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
c1fef73f |
|
24-Jul-2020 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet: add passthru code to process commands Add passthru command handling capability for the NVMeOF target and export passthru APIs which are used to integrate passthru code with nvmet-core. The new file passthru.c handles passthru cmd parsing and execution. In the passthru mode, we create a block layer request from the nvmet request and map the data on to the block layer request. Admin commands and features are on an allow list as there are a number of each that don't make too much sense with passthrough. We use an allow list such that new commands can be considered before being blindly passed through. In both cases, vendor specific commands are always allowed. We also reject reservation IO commands as the underlying device cannot differentiate between multiple hosts behind a fabric. Based-on-a-patch-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7774e77e |
|
19-Jul-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: use xarray for ctrl ns storing This patch replaces the ctrl->namespaces tracking from linked list to xarray and improves the performance when accessing one namespce :- XArray vs Default:- IOPS and BW (more the better) increase BW (~1.8%):- --------------------------------------------------- XArray :- read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) read: IOPS=162k, BW=631MiB/s (662MB/s)(18.5GiB/30001msec) Default:- read: IOPS=156k, BW=609MiB/s (639MB/s)(17.8GiB/30001msec) read: IOPS=157k, BW=613MiB/s (643MB/s)(17.0GiB/30001msec) read: IOPS=160k, BW=626MiB/s (656MB/s)(18.3GiB/30001msec) Submission latency (less the better) decrease (~8.3%):- ------------------------------------------------------- XArray:- slat (usec): min=7, max=8386, avg=11.19, stdev=5.96 slat (usec): min=7, max=441, avg=11.09, stdev=4.48 slat (usec): min=7, max=1088, avg=11.21, stdev=4.54 Default :- slat (usec): min=8, max=2826.5k, avg=23.96, stdev=3911.50 slat (usec): min=8, max=503, avg=12.52, stdev=5.07 slat (usec): min=8, max=2384, avg=12.50, stdev=5.28 CPU Usage (less the better) decrease (~5.2%):- ---------------------------------------------- XArray:- cpu : usr=1.84%, sys=18.61%, ctx=949471, majf=0, minf=250 cpu : usr=1.83%, sys=18.41%, ctx=950262, majf=0, minf=237 cpu : usr=1.82%, sys=18.82%, ctx=957224, majf=0, minf=234 Default:- cpu : usr=1.70%, sys=19.21%, ctx=858196, majf=0, minf=251 cpu : usr=1.82%, sys=19.98%, ctx=929720, majf=0, minf=227 cpu : usr=1.83%, sys=20.33%, ctx=947208, majf=0, minf=235. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
6fa350f7 |
|
02-Jun-2020 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: introduce flags member in nvmet_fabrics_ops Replace has_keyed_sgls and metadata_support booleans with a flags member that will be used for adding more features in the future. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e556f6ba |
|
26-Jun-2020 |
Christoph Hellwig <hch@lst.de> |
block: remove the bd_queue field from struct block_device Just use bd_disk->queue instead. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
819f7b88 |
|
09-Jun-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: fail outstanding host posted AEN req In function nvmet_async_event_process() we only process AENs iff there is an open slot on the ctrl->async_event_cmds[] && aen event list posted by the target is not empty. This keeps host posted AEN outstanding if target generated AEN list is empty. We do cleanup the target generated entries from the aen list in nvmet_ctrl_free()-> nvmet_async_events_free() but we don't process AEN posted by the host. This leads to following problem :- When processing admin sq at the time of nvmet_sq_destroy() holds an extra percpu reference(atomic value = 1), so in the following code path after switching to atomic rcu, release function (nvmet_sq_free()) is not getting called which blocks the sq->free_done in nvmet_sq_destroy() :- nvmet_sq_destroy() percpu_ref_kill_and_confirm() - __percpu_ref_switch_mode() -- __percpu_ref_switch_to_atomic() --- call_rcu() -> percpu_ref_switch_to_atomic_rcu() ---- /* calls switch callback */ - percpu_ref_put() -- percpu_ref_put_many(ref, 1) --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count))) ---- ref->release(ref); <---- Not called. This results in indefinite hang:- void nvmet_sq_destroy(struct nvmet_sq *sq) ... if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) { nvmet_async_events_process(ctrl, status); percpu_ref_put(&sq->ref); } percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq); wait_for_completion(&sq->confirm_done); wait_for_completion(&sq->free_done); <-- Hang here Which breaks the further disconnect sequence. This problem seems to be introduced after commit 64f5e9cdd711b ("nvmet: fix memory leak when removing namespaces and controllers concurrently"). This patch processes ctrl->async_event_cmds[] in the admin sq destroy() context irrespetive of aen_list. Also we get rid of the controller's aen_list processing in the nvmet_sq_destroy() context and just ignore ctrl->aen_list. This results in nvmet_async_events_process() being called from workqueue context so we adjust the code accordingly. Fixes: 64f5e9cdd711 ("nvmet: fix memory leak when removing namespaces and controllers concurrently ") Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
1cdf9f76 |
|
18-May-2020 |
David Milburn <dmilburn@redhat.com> |
nvmet: cleanups the loop in nvmet_async_events_process Based-on-a-patch-by: Christoph Hellwig <hch@infradead.org> Tested-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
64f5e9cd |
|
20-May-2020 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: fix memory leak when removing namespaces and controllers concurrently When removing a namespace, we add an NS_CHANGE async event, however if the controller admin queue is removed after the event was added but not yet processed, we won't free the aens, resulting in the below memory leak [1]. Fix that by moving nvmet_async_event_free to the final controller release after it is detached from subsys->ctrls ensuring no async events are added, and modify it to simply remove all pending aens. -- $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff888c1af2c000 (size 32): comm "nvmetcli", pid 5164, jiffies 4295220864 (age 6829.924s) hex dump (first 32 bytes): 28 01 82 3b 8b 88 ff ff 28 01 82 3b 8b 88 ff ff (..;....(..;.... 02 00 04 65 76 65 6e 74 5f 66 69 6c 65 00 00 00 ...event_file... backtrace: [<00000000217ae580>] nvmet_add_async_event+0x57/0x290 [nvmet] [<0000000012aa2ea9>] nvmet_ns_changed+0x206/0x300 [nvmet] [<00000000bb3fd52e>] nvmet_ns_disable+0x367/0x4f0 [nvmet] [<00000000e91ca9ec>] nvmet_ns_free+0x15/0x180 [nvmet] [<00000000a15deb52>] config_item_release+0xf1/0x1c0 [<000000007e148432>] configfs_rmdir+0x555/0x7c0 [<00000000f4506ea6>] vfs_rmdir+0x142/0x3c0 [<0000000000acaaf0>] do_rmdir+0x2b2/0x340 [<0000000034d1aa52>] do_syscall_64+0xa5/0x4d0 [<00000000211f13bc>] entry_SYSCALL_64_after_hwframe+0x6a/0xdf Fixes: a07b4970f464 ("nvmet: add a generic NVMe target") Reported-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: David Milburn <dmilburn@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
c6e3f133 |
|
19-May-2020 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: add metadata support for block devices Allocate the metadata SGL buffers and set metadata fields for the request. Then create a block IO request for the metadata from the protection SG list. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
ea52ac1c |
|
19-May-2020 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: add metadata/T10-PI support Expose the namespace metadata format when PI is enabled. The user needs to enable the capability per subsystem and per port. The other metadata properties are taken from the namespace/bdev. Usage example: echo 1 > /config/nvmet/subsystems/${NAME}/attr_pi_enable echo 1 > /config/nvmet/ports/${PORT_NUM}/param_pi_enable Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
136cc1ff |
|
19-May-2020 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len The function doesn't check only the data length, because the transfer length includes also the metadata length in some cases. This is preparation for adding metadata (T10-PI) support. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
de124f42 |
|
19-May-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: generate AEN for ns revalidate size change The newly added function nvmet_ns_revalidate() does update the ns size in the identify namespace in-core target data structure when host issues id-ns command. This can lead to host having inconsistencies between size of the namespace present in the id-ns command result and size of the corresponding block device until host scans the namespaces explicitly. To avoid this scenario generate AEN if old size is not same as the new one in nvmet_ns_revalidate(). This will allow automatic AEN generation when host calls id-ns command and also allows target to install userspace rules so that it can trigger nvmet_ns_revalidate() (using configfs interface with the help of next patch) resulting in appropriate AEN generation when underlying namespace size change is detected. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
463c5fab |
|
19-May-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add helper to revalidate bdev and file ns This patch adds a wrapper helper to indicate size change in the bdev & file-backed namespace when revalidating ns. This helper is needed in order to minimize code repetition in the next patch for configfs.c and existing admin-cmd.c. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
696ece75 |
|
19-May-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add async event tracing support This adds a new tracepoint for the target to trace async event. This is helpful in debugging and comparing host and target side async events especially when host is connected to different targets on different machines and now that we rely on userspace components to generate AEN. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
013b7ebe |
|
30-Jan-2020 |
Mark Ruijter <MRuijter@onestopsystems.com> |
nvmet: make ctrl model configurable This patch adds a new target subsys attribute which allows user to optionally specify model name which then used in the nvmet_execute_identify_ctrl() to fill up the nvme_id_ctrl structure. The default value for the model is set to "Linux" for backward compatibility. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Mark Ruijter <MRuijter@onestopsystems.com> [chaitanya.kulkarni@wdc.com *Use macro for default model, coding style fixes. *Use RCU for accessing model in for configfs and in nvmet_execute_identify_ctrl(). ] Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
94a39d61 |
|
30-Jan-2020 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: make ctrl-id configurable This patch adds a new target subsys attribute which allows user to optionally specify target controller IDs which then used in the nvmet_execute_identify_ctrl() to fill up the nvme_id_ctrl structure. For example, when using a cluster setup with two nodes, with a dual ported NVMe drive and exporting the drive from both the nodes, The connection to the host fails due to the same controller ID and results in the following error message:- "nvme nvmeX: Duplicate cntlid XXX with nvmeX, rejecting" With this patch now user can partition the controller IDs for each subsystem by setting up the cntlid_min and cntlid_max. These values will be used at the time of the controller ID creation. By partitioning the ctrl-ids for each subsystem results in the unique ctrl-id space which avoids the collision. When new attribute is not specified target will fall back to original cntlid calculation method. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
0f5be6a4 |
|
30-Jan-2020 |
Daniel Wagner <dwagner@suse.de> |
nvmet: update AEN list and array at one place All async events are enqueued via nvmet_add_async_event() which updates the ctrl->async_event_cmds[] array and additionally an struct nvmet_async_event is added to the ctrl->async_events list. Under normal operations the nvmet_async_event_work() updates again the ctrl->async_event_cmds and removes the corresponding struct nvmet_async_event from the list again. Though nvmet_sq_destroy() could be called which calls nvmet_async_events_free() which only updates the ctrl->async_event_cmds[] array. Add new functions nvmet_async_events_process() and nvmet_async_events_free() to process async events, update an array and the list. When we destroy submission queue after clearing the aen present on the ctrl->async list we also loop over ctrl->async_event_cmds[] for any requests posted by the host for which we don't have the AEN in the ctrl->async_events list by calling nvmet_async_event_process() and nvmet_async_events_free(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> [chaitanya.kulkarni@wdc.com * Loop over and clear out outstanding requests * Update changelog ] Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
b716e688 |
|
27-Jan-2020 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: fix dsm failure when payload does not match sgl descriptor The host is allowed to pass the controller an sgl describing a buffer that is larger than the dsm payload itself, allow it when executing dsm. Reported-by: Dakshaja Uppalapati <dakshaja@chelsio.com> Reviewed-by: Christoph Hellwig <hch@lst.de>, Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
4ac76436 |
|
11-Jan-2020 |
Amol Grover <frextrite@gmail.com> |
nvmet: Pass lockdep expression to RCU lists ctrl->subsys->namespaces and subsys->namespaces are traversed with list_for_each_entry_rcu outside an RCU read-side critical section but under the protection of ctrl->subsys->lock and subsys->lock respectively. Hence, add the corresponding lockdep expression to the list traversal primitive to silence false-positive lockdep warnings, and harden RCU lists. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
|
#
d84dd8cd |
|
25-Oct-2019 |
Christoph Hellwig <hch@lst.de> |
nvmet: clean up command parsing a bit Move the special cases for fabrics commands and the discovery controller to nvmet_parse_admin_cmd in preparation for adding passthrough support. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
be3f3114 |
|
23-Oct-2019 |
Christoph Hellwig <hch@lst.de> |
nvmet: Open code nvmet_req_execute() Now that nvmet_req_execute does nothing, open code it. Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, update changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e9061c39 |
|
23-Oct-2019 |
Christoph Hellwig <hch@lst.de> |
nvmet: Remove the data_len field from the nvmet_req struct Instead of storing the expected length and checking it when it's executed, just check the length inside the command themselves. A new helper, nvmet_check_data_len() is created to help with this check. Signed-off-by: Christoph Hellwig <hch@lst.de> [split patch, udpate changelog] Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e522f446 |
|
13-Oct-2019 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: add unlikely check at nvmet_req_alloc_sgl The call to sgl_alloc shouldn't fail so add this simple optimization to the fast path. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
cfc1a1af |
|
31-Jul-2019 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet-file: fix nvmet_file_flush() always returning an error Presently, nvmet_file_flush() always returns a call to errno_to_nvme_status() but that helper doesn't take into account the case when errno=0. So nvmet_file_flush() always returns an error code. All other callers of errno_to_nvme_status() check for success before calling it. To fix this, ensure errno_to_nvme_status() returns success if the errno is zero. This should prevent future mistakes like this from happening. Fixes: c6aa3542e010 ("nvmet: add error log support for file backend") Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
3aed8673 |
|
31-Jul-2019 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet: Fix use-after-free bug when a port is removed When a port is removed through configfs, any connected controllers are still active and can still send commands. This causes a use-after-free bug which is detected by KASAN for any admin command that dereferences req->port (like in nvmet_execute_identify_ctrl). To fix this, disconnect all active controllers when a subsystem is removed from a port. This ensures there are no active controllers when the port is eventually removed. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
a5448fdc |
|
12-Jun-2019 |
Minwoo Im <minwoo.im.dev@gmail.com> |
nvmet: introduce target-side trace This patch introduces target-side request tracing. As Christoph suggested, the trace would not be in a core or module to avoid disadvantages like cache miss: http://lists.infradead.org/pipermail/linux-nvme/2019-June/024721.html The target-side trace code is entirely based on the Johannes's trace code from the host side. It has lots of codes duplicated, but it would be better than having advantages mentioned above. It also traces not only fabrics commands, but also nvme normal commands. Once the codes to be shared gets bigger, then we can make it common as suggsted. This also removed the create_sq and create_cq trace parsing functions because it will be done by the connect fabrics command. Example: echo 1 > /sys/kernel/debug/tracing/event/nvmet/nvmet_req_init/enable echo 1 > /sys/kernel/debug/tracing/event/nvmet/nvmet_req_complete/enable cat /sys/kernel/debug/tracing/trace Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> [hch: fixed the symbol namespace and a an endianess conversion] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7a1f46e3 |
|
05-Jun-2019 |
Minwoo Im <minwoo.im.dev@gmail.com> |
nvme: introduce nvme_is_fabrics to check fabrics cmd This patch introduces a nvme_is_fabrics() inline function to check whether or not the given command structure is for fabrics. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
9d09dd8d |
|
14-May-2019 |
James Smart <jsmart2021@gmail.com> |
nvmet: add transport discovery change op Some transports, such as FC-NVME, support discovery controller change events without the use of a persistent discovery controller. FC receives events via RSCN from the FC Fabric Controller or subsystem FC port. This patch adds a nvmet transport op that is called whenever a discovery change event occurs in the nvmet layer. To facilitate the callback without adding another layer to cross into core.c to reference the transport ops, the port structure snapshots the transport ops when the port is enabled and clears them when disabled. Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a5dffbb6 |
|
23-Apr-2019 |
Enrico Weigelt, metux IT consult <info@metux.net> |
nvmet: include <linux/scatterlist.h> Build breaks: drivers/nvme/target/core.c: In function 'nvmet_req_alloc_sgl': drivers/nvme/target/core.c:939:12: error: implicit declaration of \ function 'sgl_alloc'; did you mean 'bio_alloc'? \ [-Werror=implicit-function-declaration] req->sg = sgl_alloc(req->transfer_len, GFP_KERNEL, &req->sg_cnt); ^~~~~~~~~ bio_alloc drivers/nvme/target/core.c:939:10: warning: assignment makes pointer \ from integer without a cast [-Wint-conversion] req->sg = sgl_alloc(req->transfer_len, GFP_KERNEL, &req->sg_cnt); ^ drivers/nvme/target/core.c: In function 'nvmet_req_free_sgl': drivers/nvme/target/core.c:952:3: error: implicit declaration of \ function 'sgl_free'; did you mean 'ida_free'? [-Werror=implicit-function-declaration] sgl_free(req->sg); ^~~~~~~~ ida_free Cause: 1. missing include to <linux/scatterlist.h> 2. SGL_ALLOC needs to be enabled Therefore adding the missing include, as well as Kconfig dependency. Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
6b7e631b |
|
07-Apr-2019 |
Minwoo Im <minwoo.im.dev@gmail.com> |
nvmet: return a specified error it subsys_alloc fails nvmet_subsys_alloc() returns its pointer or NULL if it fails. We can see three different steps in this function: 1. memory allocation 2. argument check 3. memory allocation for string But now the callers of this function do not seem to handle case 2 by returning -ENOMEM only even if it fails with an invalid parameter. This patch specifies error codes so that caller can pass it to its own caller. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
fc6c9730 |
|
08-Apr-2019 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: rename nvme_completion instances from rsp to cqe Use NVMe namings for improving code readability. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by : Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
013a63ef |
|
02-Apr-2019 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: add safety check for subsystem lock during nvmet_ns_changed we need to make sure that subsystem lock is taken during ctrl's list traversing. nvmet_ns_changed function is not static and can be used from various callers simultaneously. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e84c2091 |
|
02-Apr-2019 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: never fail double namespace enablement In case we create N namespaces while N < NVMET_MAX_NAMESPACES, we can perform "echo 1 > <nsid>/enable" as much as we want. In case N == NVMET_MAX_NAMESPACES we fail. Make sure we have the same flow for any N. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a536b497 |
|
27-Mar-2019 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: fix error flow during ns enable In case we fail to enable p2pmem on the current namespace, disable the backing store device before exiting. Cc: Stephen Bates <sbates@raithlin.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d11de63f |
|
13-Mar-2019 |
Yufen Yu <yuyufen@huawei.com> |
nvme-loop: init nvmet_ctrl fatal_err_work when allocate After commit 4d43d395fe (workqueue: Try to catch flush_work() without INIT_WORK()), it can cause warning when delete nvme-loop device, trace like: [ 76.601272] Call Trace: [ 76.601646] ? del_timer+0x72/0xa0 [ 76.602156] __cancel_work_timer+0x1ae/0x270 [ 76.602791] cancel_work_sync+0x14/0x20 [ 76.603407] nvmet_ctrl_free+0x1b7/0x2f0 [nvmet] [ 76.604091] ? free_percpu+0x168/0x300 [ 76.604652] nvmet_sq_destroy+0x106/0x240 [nvmet] [ 76.605346] nvme_loop_destroy_admin_queue+0x30/0x60 [nvme_loop] [ 76.606220] nvme_loop_shutdown_ctrl+0xc3/0xf0 [nvme_loop] [ 76.607026] nvme_loop_delete_ctrl_host+0x19/0x30 [nvme_loop] [ 76.607871] nvme_do_delete_ctrl+0x75/0xb0 [ 76.608477] nvme_sysfs_delete+0x7d/0xc0 [ 76.609057] dev_attr_store+0x24/0x40 [ 76.609603] sysfs_kf_write+0x4c/0x60 [ 76.610144] kernfs_fop_write+0x19a/0x260 [ 76.610742] __vfs_write+0x1c/0x60 [ 76.611246] vfs_write+0xfa/0x280 [ 76.611739] ksys_write+0x6e/0x120 [ 76.612238] __x64_sys_write+0x1e/0x30 [ 76.612787] do_syscall_64+0xbf/0x3a0 [ 76.613329] entry_SYSCALL_64_after_hwframe+0x44/0xa9 We fix it by moving fatal_err_work init to nvmet_alloc_ctrl(), which may more reasonable. Signed-off-by: Yufen Yu <yuyufen@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
77141dc6 |
|
18-Feb-2019 |
Christoph Hellwig <hch@lst.de> |
nvmet: convert to SPDX identifiers Update license to use SPDX-License-Identifier instead of verbose license text. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
#
5698b805 |
|
17-Dec-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: use a macro for default error location This patch defines a new macro NVMET_NO_ERROR_LOC to represent the default error location value in the nvme-error-log-page. This is a pure cleanup patch and it does not change any functionality. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
66c6afbd |
|
14-Dec-2018 |
Colin Ian King <colin.king@canonical.com> |
nvmet: fix comparison of a u16 with -1 Currently the u16 req->error_loc is being compared to -1 which will always be false. Fix this by casting -1 to u16 to fix this. Detected by clang: warning: result of comparison of constant -1 with expression of type 'u16' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare] Fixes: 76574f37bf4c ("nvmet: add interface to update error-log page") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
c6aa3542 |
|
12-Dec-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add error log support for file backend This patch adds support for the file backend to populate the error log entries. Here we map the errno to the NVMe status codes. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e81446af |
|
12-Dec-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add error log support in the core This patch adds the support to maintain error log page for the nvmet-core. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
76574f37 |
|
12-Dec-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add interface to update error-log page This patch adds nvmet_req based interface to the nvmet-core so that we can update the error log page. We update error log page in the request completion path when status is not set to NVME_SC_SUCCESS. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e4a97625 |
|
12-Dec-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add error-log definitions This patch adds necessary fields in the target data structures to support error log page. For a target controller, we add a new error log field to maintain the error log, at any given point we maintain error entries equal to NVMET_ERROR_LOG_SLOTS for each controller. In the following patch, we also update the error log page entry in the I/O completion path so we introduce a spinlock for synchronization of the log. For nvmet_req, we add a new field error_loc to hold the location of the error in the command when the actual error occurs for each request and a starting LBA if applicable. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
cb019da3 |
|
19-Nov-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: use unlikely for req status check This patch adds unlikely in the nvmet request completion path for the status check in the low level function __nvmet_req_complete. This is helpful in the scenario where host and target connection is working smoothly. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e6a622fd |
|
19-Nov-2018 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: support fabrics sq flow control Technical proposal 8005 "fabrics SQ flow control" introduces a mode where a host and controller agree to omit sq_head pointer updates when sending nvme completions. In case the host indicated desire to operate in this mode (connect attribute) the controller will return back a connect completion with sq_head value of 0xffff as indication that it will omit sq_head pointer updates. This mode saves us an atomic update in the I/O path. Reviewed-by: Hannes Reinecke <hare@suse.com> [hch: suggested better implementation] Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
b662a078 |
|
12-Nov-2018 |
Jay Sternberg <jay.e.sternberg@intel.com> |
nvmet: enable Discovery Controller AENs Add functions to find connections requesting Discovery Change events and send a notification to hosts that maintain an explicit persistent connection and have and active Asynchronous Event Request pending. Only Hosts that have access to the Subsystem effected by the change will receive notifications of Discovery Change event. Call these functions each time there is a configfs change that effects the Discover Log Pages. Set the OAES field in the Identify Controller response to advertise the support for Asynchronous Event Notifications. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
253928ee |
|
12-Nov-2018 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: allow host connect even if no allowed subsystems are exported It is perfectly valid that a host connects to a discovery subsystem and gets an empty discovery log page since no subsystems are provisioned to it. No reason to disallow connecting to the discovery subsystem all together. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
f9362ac1 |
|
12-Nov-2018 |
Jay Sternberg <jay.e.sternberg@intel.com> |
nvmet: allow Keep Alive for Discovery controller Per change to specification allowing Discovery controllers to have explicit persistent connections, remove restriction on Discovery controllers allowing kato on connect. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
7114ddeb |
|
12-Nov-2018 |
Jay Sternberg <jay.e.sternberg@intel.com> |
nvmet: change aen mask functions to use bit numbers Functions nvmet_aen_disabled and nvmet_clear_aen were using values not bit numbers ie 1 << 9 not 9 for bit function clear_bit and test_and_set_bit. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Phil Cayton <phil.cayton@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
6c8312ad |
|
12-Nov-2018 |
Jay Sternberg <jay.e.sternberg@intel.com> |
nvmet: provide aen bit functions for multiple controller types Move nvmet_aen_disabled and nvmet_clear_aen in preparation for other types of controllers to use, initially the discovery controller. Signed-off-by: Jay Sternberg <jay.e.sternberg@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
c09305ae |
|
02-Nov-2018 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: support for traffic based keep-alive A controller that supports traffic based keep-alive can restart the keep alive timer even when no keep-alive was not received in the kato period as long as other admin or I/O commands were received. For each command set ctrl->cmd_seen to true, and when keep-alive timer expires, if any commands were seen, resched ka_work instead of escalating to a fatal error. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
21d3bbdd |
|
02-Nov-2018 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: don't try to add ns to p2p map unless it actually uses it Even without CONFIG_P2PDMA this results in a error print: nvmet: no peer-to-peer memory is available that's supported by rxe0 and /dev/nullb0 Fixes: c6925093d0b2 ("nvmet: Optionally use PCI P2P memory") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
c6925093 |
|
04-Oct-2018 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet: Optionally use PCI P2P memory Create a configfs attribute in each nvme-fabrics namespace to enable P2P memory use. The attribute may be enabled (with a boolean) or a specific P2P device may be given (with the device's PCI name). When enabled, the namespace will ensure the underlying block device supports P2P and is compatible with any specified P2P device. If no device was specified it will ensure there is compatible P2P memory somewhere in the system. Enabling a namespace with P2P memory will fail with EINVAL (and an appropriate dmesg error) if any of these conditions are not met. Once a controller is set up on a specific port, the P2P device to use for each namespace will be found and stored in a radix tree by namespace ID. When memory is allocated for a request, the tree is used to look up the P2P device to allocate memory against. If no device is in the tree (because no appropriate device was found), or if allocation of P2P memory fails, fall back to using regular memory. Signed-off-by: Stephen Bates <sbates@raithlin.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> [hch: partial rewrite of the initial code] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
|
#
5b2322e4 |
|
04-Oct-2018 |
Logan Gunthorpe <logang@deltatee.com> |
nvmet: Introduce helper functions to allocate and free request SGLs Add helpers to allocate and free the SGL in a struct nvmet_req: int nvmet_req_alloc_sgl(struct nvmet_req *req) void nvmet_req_free_sgl(struct nvmet_req *req) This will be expanded in a future patch to implement peer-to-peer memory DMAs and should be common with all target drivers. The new helpers are used in nvmet-rdma. Seeing we use req.transfer_len as the length of the SGL it is set earlier and cleared on any error. It also seems to be unnecessary to accumulate the length as the map_sgl functions should only ever be called once per request. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me>
|
#
43a6f8fb |
|
08-Oct-2018 |
Bart Van Assche <bvanassche@acm.org> |
nvmet: use strcmp() instead of strncmp() for subsystem lookup strncmp() stops comparing when either the end of one of the first two arguments is reached or when 'n' characters have been compared, whichever comes first. That means that strncmp(s1, s2, n) is equivalent to strcmp(s1, s2) if n exceeds the length of s1 or the length of s2. Since that is the case in nvmet_find_get_subsys(), change strncmp() into strcmp(). This patch avoids that the following warning is reported by smatch: drivers/nvme/target/core.c:940:1 nvmet_find_get_subsys() error: strncmp() '"nqn.2014-08.org.nvmexpress.discovery"' too small (37 vs 223) Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
04db0e5e |
|
15-Aug-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: free workqueue object if module init fails Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
dedf0be5 |
|
08-Aug-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add ns write protect support This patch implements the Namespace Write Protect feature described in "NVMe TP 4005a Namespace Write Protect". In this version, we implement No Write Protect and Write Protect states for target ns which can be toggled by set-features commands from the host side. For write-protect state transition, we need to flush the ns specified as a part of command so we also add helpers for carrying out synchronous flush operations. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: fixed an incorrect endianess conversion, minor cleanups] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
62ac0d32 |
|
01-Jun-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: support configuring ANA groups Allow creating non-default ANA groups (group ID > 1). Groups are created either by assigning the group ID to a namespace, or by creating a configfs group object under a specific port. All namespaces assigned to a group that doesn't have a configfs object for a given port are marked as inaccessible. Allow changing the ANA state on a per-port basis by creating an ana_groups directory under each port, and another directory with an ana_state file in it. The default ANA group 1 directory is created automatically for each port. For all changes in ANA configuration the ANA change AEN is sent. We only keep a global changecount instead of additional per-group changecounts to keep the implementation as simple as possible. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
|
#
72efd25d |
|
19-Jul-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: add minimal ANA support Add support for Asynchronous Namespace Access as specified in NVMe 1.3 TP 4004. Just add a default ANA group 1 that is optimized on all ports. This is (and will remain) the default assignment for any namespace not epxlicitly assigned to another ANA group. The ANA state can be manually changed through the configfs interface, including the change state. Includes fixes and improvements from Hannes Reinecke. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
|
#
793c7cfc |
|
13-May-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: track and limit the number of namespaces per subsystem TP 4004 introduces a new 'Maximum Number of Allocated Namespaces' field in the Identify controller data to help the host size resources. Put an upper limit on the supported namespaces to be able to support this value as supporting 32-bits worth of namespaces would lead to very large buffers. The limit is completely arbitrary at this point. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
|
#
4ee43280 |
|
07-Jun-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: keep a port pointer in nvmet_ctrl This will be needed for the ANA AEN code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
|
#
405a7519 |
|
25-Jul-2018 |
Hannes Reinecke <hare@suse.de> |
nvmet: only check for filebacking on -ENOTBLK We only need to check for a file-backed namespace if nvmet_bdev_ns_enable() returns -ENOTBLK. For any other error it's pointless as the open() error will remain the same. Fixes: d5eff33e ("nvmet: add simple file backed ns support") Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
0d5ee2b2 |
|
20-Jun-2018 |
Steve Wise <larrystevenwise@gmail.com> |
nvmet-rdma: support max(16KB, PAGE_SIZE) inline data The patch enables inline data sizes using up to 4 recv sges, and capping the size at 16KB or at least 1 page size. So on a 4K page system, up to 16KB is supported, and for a 64K page system 1 page of 64KB is supported. We avoid > 0 order page allocations for the inline buffers by using multiple recv sges, one for each page. If the device cannot support the configured inline data size due to lack of enough recv sges, then log a warning and reduce the inline size. Add a new configfs port attribute, called param_inline_data_size, to allow configuring the size of inline data for a given nvmf port. The maximum size allowed is still enforced by nvmet-rdma with NVMET_RDMA_MAX_INLINE_DATA_SIZE, which is now max(16KB, PAGE_SIZE). And the default size, if not specified via configfs, is still PAGE_SIZE. This preserves the existing behavior, but allows larger inline sizes for small page systems. If the configured inline data size exceeds NVMET_RDMA_MAX_INLINE_DATA_SIZE, a warning is logged and the size is reduced. If param_inline_data_size is set to 0, then inline data is disabled for that nvmf port. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
55eb942e |
|
19-Jun-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add buffered I/O support for file backed ns Add a new "buffered_io" attribute, which disabled direct I/O and thus enables page cache based caching when enabled. The attribute can only be changed when the namespace is disabled as the file has to be reopend for the change to take effect. The possibly blocking read/write are deferred to a newly introduced global workqueue. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d68a90e1 |
|
19-Jun-2018 |
Max Gurtuvoy <maxg@mellanox.com> |
nvmet: reset keep alive timer in controller enable Controllers that are not yet enabled should not really enforce keep alive timeouts, but we still want to track a timeout and cleanup in case a host died before it enabled the controller. Hence, simply reset the keep alive timer when the controller is enabled. Suggested-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
55fdd6b6 |
|
30-May-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: mask pending AENs Per section 5.2 of the NVMe 1.3 spec: "When the controller posts a completion queue entry for an outstanding Asynchronous Event Request command and thus reports an asynchronous event, subsequent events of that event type are automatically masked by the controller until the host clears that event. An event is cleared by reading the log page associated with that event using the Get Log Page command (see section 5.14)." Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
|
#
c86b8f7b |
|
30-May-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: add AEN configuration support AEN configuration via the 'Get Features' and 'Set Features' admin command is mandatory, so we should be implemeting handling for it. Signed-off-by: Hannes Reinecke <hare@suse.com> [hch: use WRITE_ONCE, check for invalid values] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
|
#
c16734ea |
|
25-May-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: implement the changed namespaces log Just keep a per-controller buffer of changed namespaces and copy it out in the get log page implementation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
|
#
c7759fff |
|
22-May-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: add a new nvmet_zero_sgl helper Zeroes the SGL in the payload. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
|
#
d5eff33e |
|
22-May-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: add simple file backed ns support This patch adds simple file backed namespace support for NVMeOF target. The new file io-cmd-file.c is responsible for handling the code for I/O commands when ns is file backed. Also, we introduce mempools based slow path using sync I/Os for file backed ns to ensure forward progress under reclaim. The old block device based implementation is moved to io-cmd-bdev.c and use a "nvmet_bdev_" symbol prefix. The enable/disable calls are also move into the respective files. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: updated changelog, fixed double req->ns lookup in bdev case] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
618cff42 |
|
10-May-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: remove duplicate NULL initialization for req->ns Remove the duplicate NULL initialization for req->ns. req->ns is always initialized to NULL in nvmet_req_init(), so there is no need to reset it later on failures unless we have previously assigned a value to it. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
b40b83e3 |
|
08-May-2018 |
Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> |
nvmet: make a few error messages more generic "nvmet_check_ctrl_status()" is called from admin-cmd.c along with io-cmd.c, make the error message more generic. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e929f06d |
|
20-Mar-2018 |
Christoph Hellwig <hch@lst.de> |
nvmet: constify struct nvmet_fabrics_ops Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
bffd2b61 |
|
24-Jan-2018 |
Max Gurtovoy <maxg@mellanox.com> |
nvmet: fix PSDT field check in command format PSDT field section according to NVM_Express-1.3: "This field specifies whether PRPs or SGLs are used for any data transfer associated with the command. PRPs shall be used for all Admin commands for NVMe over PCIe. SGLs shall be used for all Admin and I/O commands for NVMe over Fabrics. This field shall be set to 01b for NVMe over Fabrics 1.0 implementations. Suggested-by: Idan Burstein <idanb@mellanox.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com>
|
#
423b4487 |
|
14-Jan-2018 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: release a ns reference in nvmet_req_uninit if needed nvmet_req_init looked up a namespace and took a reference on it (unless it failed prior to that). If the request is uninitialized (in error cases) we need to remove that reference in case it was taken, otherwise we leak namespace reference when calling nvme_req_uninit. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
6b1943af |
|
12-Nov-2017 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: rearrange nvmet_ctrl_free() Make it symmetric to nvmet_alloc_ctrl(). Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
eca19dc1 |
|
12-Nov-2017 |
Israel Rukshin <israelr@mellanox.com> |
nvmet: fix error flow in nvmet_alloc_ctrl() Remove the allocated id on error. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
5e62d5c9 |
|
09-Nov-2017 |
Christoph Hellwig <hch@lst.de> |
nvmet: better data length validation Currently the NVMe target stores the expexted data length in req->data_len and uses that for data transfer decisions, but that does not take the actual transfer length in the SGLs into account. So this adds a new transfer_len field, into which the transport drivers store the actual transfer length. We then check the two match before actually executing the command. The FC transport driver already had such a field, which is removed in favour of the common one. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ba2dec35 |
|
18-Oct-2017 |
Roy Shterman <roys@lightbitslabs.com> |
nvmet: Change max_nsid in subsystem due to ns_disable if needed In case we disable namespaces which has the nsid like subsystem max_nsid we need to search for the next largest nsid in this subsystem. If the subsystem don't has more namespaces we set it to 0, else we take nsid from the last namespace in namespaces list because the list is sorted while inserting. Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Roy Shterman <roys@lightbitslabs.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> [hch: slight refactor] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
f9cf2a64 |
|
18-Oct-2017 |
James Smart <jsmart2021@gmail.com> |
nvmet: synchronize sqhd update In testing target io in read write mix, we did indeed get into cases where sqhd didn't update properly and slowly missed enough updates to shutdown the queue. Protect the updating sqhd by using cmpxchg, and for that turn the sqhd field into a u32 so that cmpxchg works on it for all architectures. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
8cbd96a6 |
|
21-Sep-2017 |
James Smart <jsmart2021@gmail.com> |
nvme: fix sqhd reference when admin queue connect fails Fix bug in sqhd patch. It wasn't the sq that was at risk. In the case where the admin queue connect command fails, the sq->size field is not set. Therefore, this becomes a divide by zero error. Add a quick check to bypass under this failure condition. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
bb1cc747 |
|
18-Sep-2017 |
James Smart <jsmart2021@gmail.com> |
nvmet: implement valid sqhd values in completions To support sqhd, for initiators that are following the spec and paying attention to sqhd vs their sqtail values: - add sqhd to struct nvmet_sq - initialize sqhd to 0 in nvmet_sq_setup - rather than propagate the 0's-based qsize value from the connect message which requires a +1 in every sqhd update, and as nothing else references it, convert to 1's-based value in nvmt_sq/cq_setup() calls. - validate connect message sqsize being non-zero per spec. - updated assign sqhd for every completion that goes back. Also remove handling the NULL sq case in __nvmet_req_complete, as it can't happen with the current code. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ad4e05b2 |
|
13-Aug-2017 |
Max Gurtovoy <maxg@mellanox.com> |
nvme: add symbolic constants for CC identifiers Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
130c24b5 |
|
04-Aug-2017 |
Guan Junxiong <guanjunxiong@huawei.com> |
nvmet: fix the return error code of target if host is not allowed nvmf target shall return NVME_SC_CONNECT_INVALID_HOST instead of the gereal code INVALID_PARAM when the given host nqn is not allowed to connect. Refer to the 2.2.1 section of the NVMe over Fabrics Spec. Signed-off-by: Guan Junxiong <guanjunxiong@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
2e7f5d2a |
|
14-Jul-2017 |
Johannes Thumshirn <jthumshirn@suse.de> |
nvmet: Move serial number from controller to subsystem The NVMe specification defines the serial number as: "Serial Number (SN): Contains the serial number for the NVM subsystem that is assigned by the vendor as an ASCII string. Refer to section 7.10 for unique identifier requirements. Refer to section 1.5 for ASCII string requirements" So move it from the controller to the subsystem, where it belongs. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
637dc0f3 |
|
07-Jun-2017 |
Johannes Thumshirn <jthumshirn@suse.de> |
nvmet: implement namespace identify descriptor list A NVMe Identify NS command with a CNS value of '3' is expecting a list of Namespace Identification Descriptor structures to be returned to the host for the namespace requested in the namespace identify command. This Namespace Identification Descriptor structure consists of the type of the namespace identifier, the length of the identifier and the actual identifier. Valid types are NGUID and UUID which we have saved in our nvme_ns structure if they have been configured via configfs. If no value has been assigened to one of these we return an "invalid opcode" back to the host to maintain backward compatibiliy with older implementations without Namespace Identify Descriptor list support. Also as the Namespace Identify Descriptor list is the only mandatory feature change between 1.2.1 and 1.3 we can bump the advertised version as well. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
549f01ae |
|
08-May-2017 |
Vijay Immanuel <vijayi@attalasystems.com> |
nvmet: release the sq ref on rdma read errors On rdma read errors, release the sq ref that was taken when the req was initialized. This avoids a hang in nvmet_sq_destroy() when the queue is being freed. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
64a0ca88 |
|
27-Feb-2017 |
Parav Pandit <parav@mellanox.com> |
nvmet: Introduced helper routine for controller status check. This patch introduces helper function for checking controller status during admin and io command processing which returns u16 status. As to bring consistency on returning status, other friend functions also now return u16 status instead of int to match the spec. As part of the theseerror log prints in also prints qid on which command error occured. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
4151dd9a |
|
27-Feb-2017 |
Parav Pandit <parav@mellanox.com> |
nvmet: Fixed avoided printing nvmet: twice in error logs. This patch avoids printing "nvmet:" twice in error logs as its already coming through pr_fmt macro. Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
427242ce |
|
06-Mar-2017 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: confirm sq percpu has scheduled and switched to atomic percpu_ref_kill is not enough to prevent subsequent percpu_ref_tryget_live from failing. Hence call perfcpu_ref_kill_confirm to make it safe. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
d11ea004 |
|
06-Mar-2017 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: confirm sq percpu has scheduled and switched to atomic percpu_ref_kill is not enough to prevent subsequent percpu_ref_tryget_live from failing. Hence call perfcpu_ref_kill_confirm to make it safe. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
b2d09103 |
|
03-Feb-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/headers: Prepare to use <linux/rcuupdate.h> instead of <linux/rculist.h> in <linux/sched.h> We don't actually need the full rculist.h header in sched.h anymore, we will be able to include the smaller rcupdate.h header instead. But first update code that relied on the implicit header inclusion. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
15fbad96 |
|
14-Nov-2016 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: Make cntlid globally unique We usually log the cntlid which is confusing in case we have multiple subsystems each with it's own cntlid ida. Instead make cntlid ida globally unique and log the initial association. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
23a8ed4a |
|
01-Jan-2017 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: Call fatal_error from keep-alive timout expiration We only need to call delete_ctrl once, so given that both keep-alive timeout and any other fatal error can trigger it, just make sure we only call delete_ctrl once. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
06406d81 |
|
01-Jan-2017 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: cancel fatal error and flush async work before free controller Make sure they are not running and we can free the controller safely. Signed-off-by: Roy Shterman <roys@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
344770b0 |
|
27-Nov-2016 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: delete controllers deletion upon subsystem release No reason for them to be kept around if we are deleting the subsystem, so instead of passively wait for the host to disconnect, actively delete the controllers. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
e4fcf07c |
|
30-Oct-2016 |
Solganik Alexander <sashas@lightbitslabs.com> |
nvmet: Fix possible infinite loop triggered on hot namespace removal When removing a namespace we delete it from the subsystem namespaces list with list_del_init which allows us to know if it is enabled or not. The problem is that list_del_init initialize the list next and does not respect the RCU list-traversal we do on the IO path for locating a namespace. Instead we need to use list_del_rcu which is allowed to run concurrently with the _rcu list-traversal primitives (keeps list next intact) and guarantees concurrent nvmet_find_naespace forward progress. By changing that, we cannot rely on ns->dev_link for knowing if the namspace is enabled, so add enabled indicator entry to nvmet_ns for that. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com> Cc: <stable@vger.kernel.org> # v4.8+
|
#
8242ddac |
|
06-Nov-2016 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: Don't queue fatal error work if csts.cfs is set In the transport, in case of an interal queue error like error completion in rdma we trigger a fatal error. However, multiple queues in the same controller can serr error completions and we don't want to trigger fatal error work more than once. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
|
#
d49187e9 |
|
10-Nov-2016 |
Christoph Hellwig <hch@lst.de> |
nvme: introduce struct nvme_request This adds a shared per-request structure for all NVMe I/O. This structure is embedded as the first member in all NVMe transport drivers request private data and allows to implement common functionality between the drivers. The first use is to replace the current abuse of the SCSI command passthrough fields in struct request for the NVMe command passthrough, but it will grow a field more fields to allow implementing things like common abort handlers in the future. The passthrough commands are handled by having a pointer to the SQE (struct nvme_command) in struct nvme_request, and the union of the possible result fields, which had to be turned from an anonymous into a named union for that purpose. This avoids having to pass a reference to a full CQE around and thus makes checking the result a lot more lightweight. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
8ef2074d |
|
19-Oct-2016 |
Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> |
nvme: Add tertiary number to NVME_VS NVMe 1.2.1 specification adds a tertiary element to the version number. This updates the macro and its callers to include the final number and fixup a single place in nvmet where the version was generated manually. Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
28b89118 |
|
04-Aug-2016 |
Sagi Grimberg <sagi@grimberg.me> |
nvmet: Fix controller serial number inconsistency The host is allowed to issue identify as many times as it wants, we need to stay consistent when reporting the serial number for a given controller. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
69555af2 |
|
05-Jul-2016 |
Wei Yongjun <yongjun_wei@trendmicro.com.cn> |
nvmet: fix return value check in nvmet_subsys_alloc() In case of error, the function kstrndup() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Jay Freyensee <james_p_freyensee@linux.intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
a07b4970 |
|
21-Jun-2016 |
Christoph Hellwig <hch@lst.de> |
nvmet: add a generic NVMe target This patch introduces a implementation of NVMe subsystems, controllers and discovery service which allows to export NVMe namespaces across fabrics such as Ethernet, FC etc. The implementation conforms to the NVMe 1.2.1 specification and interoperates with NVMe over fabrics host implementations. Configuration works using configfs, and is best performed using the nvmetcli tool from http://git.infradead.org/users/hch/nvmetcli.git, which also has a detailed explanation of the required steps in the README file. Signed-off-by: Armen Baloyan <armenx.baloyan@intel.com> Signed-off-by: Anthony Knapp <anthony.j.knapp@intel.com> Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|