#
77c05922 |
|
15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
btt: pass queue_limits to blk_mq_alloc_disk Pass the queue limits directly to blk_alloc_disk instead of setting them one at a time. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20240215071055.2201424-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
74fa8f9c |
|
15-Feb-2024 |
Christoph Hellwig <hch@lst.de> |
block: pass a queue_limits argument to blk_alloc_disk Pass a queue_limits to blk_alloc_disk and apply it if non-NULL. This will allow allocating queues with valid queue limits instead of setting the values one at a time later. Also change blk_alloc_disk to return an ERR_PTR instead of just NULL which can't distinguish errors. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Link: https://lore.kernel.org/r/20240215071055.2201424-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
b1921141 |
|
07-Dec-2023 |
Randy Dunlap <rdunlap@infradead.org> |
nvdimm/btt: fix btt_blk_cleanup() kernel-doc Correct the function parameters to prevent kernel-doc warnings: btt.c:1567: warning: Function parameter or member 'nd_region' not described in 'btt_init' btt.c:1567: warning: Excess function parameter 'maxlane' description in 'btt_init' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: <nvdimm@lists.linux.dev> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231207210545.24056-1-rdunlap@infradead.org Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
#
9aa6543e |
|
14-Dec-2023 |
Dinghao Liu <dinghao.liu@zju.edu.cn> |
nvdimm-btt: simplify code with the scope based resource management Use the scope based resource management (defined in linux/cleanup.h) to automate resource lifetime control on struct btt_sb *super in discover_arenas(). Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231214083919.22218-1-dinghao.liu@zju.edu.cn Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
#
ab7e8bb6 |
|
19-Oct-2023 |
Justin Stitt <justinstitt@google.com> |
nvdimm/btt: replace deprecated strncpy with strscpy Found with grep. strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect super->signature to be NUL-terminated based on its usage with memcmp against a NUL-term'd buffer: btt_devs.c: 253 | if (memcmp(super->signature, BTT_SIG, BTT_SIG_LEN) != 0) btt.h: 13 | #define BTT_SIG "BTT_ARENA_INFO\0" NUL-padding is not required as `super` is already zero-allocated: btt.c: 985 | super = kzalloc(sizeof(struct btt_sb), GFP_NOIO); ... rendering any additional NUL-padding superfluous. Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Let's also use the more idiomatic strscpy usage of (dest, src, sizeof(dest)) instead of (dest, src, XYZ_LEN) for buffers that the compiler can determine the size of. This more tightly correlates the destination buffer to the amount of bytes copied. Side note, this pattern of memcmp() on two NUL-terminated strings should really be changed to just a strncmp(), if i'm not mistaken? I see multiple instances of this pattern in this system: | if (memcmp(super->signature, BTT_SIG, BTT_SIG_LEN) != 0) | return false; where BIT_SIG is defined (weirdly) as a double NUL-terminated string: | #define BTT_SIG "BTT_ARENA_INFO\0" Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231019-strncpy-drivers-nvdimm-btt-c-v2-1-366993878cf0@google.com Signed-off-by: Kees Cook <keescook@chromium.org>
|
#
3222d8c2 |
|
25-Jan-2023 |
Christoph Hellwig <hch@lst.de> |
block: remove ->rw_page The ->rw_page method is a special purpose bypass of the usual bio handling path that is limited to single-page reads and writes and synchronous which causes a lot of extra code in the drivers, callers and the block layer. The only remaining user is the MM swap code. Switch that swap code to simply submit a single-vec on-stack bio an synchronously wait on it based on a newly added QUEUE_FLAG_SYNCHRONOUS flag set by the drivers that currently implement ->rw_page instead. While this touches one extra cache line and executes extra code, it simplifies the block layer and drivers and ensures that all feastures are properly supported by all drivers, e.g. right now ->rw_page bypassed cgroup writeback entirely. [akpm@linux-foundation.org: fix comment typo, per Dan] Link: https://lkml.kernel.org/r/20230125133436.447864-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <kbusch@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
#
ba229aa8 |
|
14-Jul-2022 |
Bart Van Assche <bvanassche@acm.org> |
nvdimm-btt: Use the enum req_op type Improve static type checking by using the enum req_op type where appropriate. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-20-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
86947df3 |
|
14-Jul-2022 |
Bart Van Assche <bvanassche@acm.org> |
block: Change the type of the last .rw_page() argument All .rw_page() callers pass an enum req_op value as last argument. Make this explicit by changing the type of the last argument into enum req_op. See also commit 3f289dcb4b26 ("block: make bdev_ops->rw_page() take a REQ_OP instead of bool"). Cc: Tejun Heo <tj@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20220714180729.1065367-4-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
8b9ab626 |
|
19-Jun-2022 |
Christoph Hellwig <hch@lst.de> |
block: remove blk_cleanup_disk blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
32051906 |
|
03-Mar-2022 |
Christoph Hellwig <hch@lst.de> |
nvdimm-btt: use bvec_kmap_local in btt_rw_integrity Using local kmaps slightly reduces the chances to stray writes, and the bvec interface cleans up the code a little bit. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220303111905.321089-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
322cbb50 |
|
24-Jan-2022 |
Christoph Hellwig <hch@lst.de> |
block: remove genhd.h There is no good reason to keep genhd.h separate from the main blkdev.h header that includes it. So fold the contents of genhd.h into blkdev.h and remove genhd.h entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
16be7974 |
|
03-Nov-2021 |
Luis Chamberlain <mcgrof@kernel.org> |
nvdimm/btt: add error handling support for add_disk() We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-3-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
2762ff06 |
|
03-Nov-2021 |
Luis Chamberlain <mcgrof@kernel.org> |
nvdimm/btt: use goto error labels on btt_blk_init() This will make it easier to share common error paths. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-2-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
3aefb5ee |
|
03-Nov-2021 |
Luis Chamberlain <mcgrof@kernel.org> |
nvdimm/btt: do not call del_gendisk() if not needed del_gendisk() should not called if the disk has not been added. Fix this. Fixes: 41cd8b70c37a ("libnvdimm, btt: add support for blk integrity") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103165843.1402142-1-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
d1c6e08e |
|
08-Sep-2021 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm/labels: Add uuid helpers In preparation for CXL labels that move the uuid to a different offset in the label, add nsl_{ref,get,validate}_uuid(). These helpers use the proper uuid_t type. That type definition predated the libnvdimm subsystem, so now is as a good a time as any to convert all the uuid handling in the subsystem to uuid_t to match the helpers. Note that the uuid fields in the label data and superblocks is not replaced per Andy's expectation that uuid_t is a kernel internal type not to appear in external ABI interfaces. So, in those case {import,export}_uuid() is used to go between the 2 types. Also note that this rework uncovered some unnecessary copies for label comparisons, those are cleaned up with nsl_uuid_equal(). As for the whitespace changes, all new code is clang-format compliant. Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/163116429748.2460985.15659993454313919977.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3e08773c |
|
12-Oct-2021 |
Christoph Hellwig <hch@lst.de> |
block: switch polling to be bio based Replace the blk_poll interface that requires the caller to keep a queue and cookie from the submissions with polling based on the bio. Polling for the bio itself leads to a few advantages: - the cookie construction can made entirely private in blk-mq.c - the caller does not need to remember the request_queue and cookie separately and thus sidesteps their lifetime issues - keeping the device and the cookie inside the bio allows to trivially support polling BIOs remapping by stacking drivers - a lot of code to propagate the cookie back up the submission path can be removed entirely. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mark Wunderlich <mark.wunderlich@intel.com> Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
d4e4e583 |
|
20-May-2021 |
Christoph Hellwig <hch@lst.de> |
nvdimm-btt: convert to blk_alloc_disk/blk_cleanup_disk Convert the nvdimm-btt driver to use the blk_alloc_disk and blk_cleanup_disk helpers to simplify gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
0d1feb72 |
|
20-May-2021 |
Christoph Hellwig <hch@lst.de> |
block: automatically enable GENHD_FL_EXT_DEVT Automatically set the GENHD_FL_EXT_DEVT flag for all disks allocated without an explicit number of minors. This is what all new block drivers should do, so make sure it is the default without boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20210521055116.1053587-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
4ee60ec1 |
|
06-May-2021 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
include: remove pagemap.h from blkdev.h My UEK-derived config has 1030 files depending on pagemap.h before this change. Afterwards, just 326 files need to be rebuilt when I touch pagemap.h. I think blkdev.h is probably included too widely, but untangling that dependency is harder and this solves my problem. x86 allmodconfig builds, but there may be implicit include problems on other architectures. Link: https://lkml.kernel.org/r/20210309195747.283796-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Dan Williams <dan.j.williams@intel.com> [nvdimm] Acked-by: Jens Axboe <axboe@kernel.dk> [block] Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: Martin K. Petersen <martin.petersen@oracle.com> [scsi] Reviewed-by: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
309dca30 |
|
24-Jan-2021 |
Christoph Hellwig <hch@lst.de> |
block: store a block_device pointer in struct bio Replace the gendisk pointer in struct bio with a pointer to the newly improved struct block device. From that the gendisk can be trivially accessed with an extra indirection, but it also allows to directly look up all information related to partition remapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
a8b456d0 |
|
24-Sep-2020 |
Christoph Hellwig <hch@lst.de> |
bdi: remove BDI_CAP_SYNCHRONOUS_IO BDI_CAP_SYNCHRONOUS_IO is only checked in the swap code, and used to decided if ->rw_page can be used on a block device. Just check up for the method instead. The only complication is that zram needs a second set of block_device_operations as it can switch between modes that actually support ->rw_page and those who don't. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
32f61d67 |
|
01-Sep-2020 |
Christoph Hellwig <hch@lst.de> |
nvdimm: simplify revalidate_disk handling The nvdimm block driver abuse revalidate_disk in a strange way, and totally unrelated to what other drivers do. Simplify this by just calling nvdimm_revalidate_disk (which seems rather misnamed) from the probe routines, as the additional bdev size revalidation is pointless at this point, and remove the revalidate_disk methods given that it can only be triggered from add_disk, which is right before the manual calls. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
af3bbc12 |
|
14-Aug-2020 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
mm: add thp_size This function returns the number of bytes in a THP. It is like page_size(), but compiles to just PAGE_SIZE if CONFIG_TRANSPARENT_HUGEPAGE is disabled. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Link: http://lkml.kernel.org/r/20200629151959.15779-5-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
c62b37d9 |
|
01-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
block: move ->make_request_fn to struct block_device_operations The make_request_fn is a little weird in that it sits directly in struct request_queue instead of an operation vector. Replace it with a block_device_operations method called submit_bio (which describes much better what it does). Also remove the request_queue argument to it, as the queue can be derived pretty trivially from the bio. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
0fd92f89 |
|
26-May-2020 |
Christoph Hellwig <hch@lst.de> |
nvdimm: use bio_{start,end}_io_acct Switch dm to use the nicer bio accounting helpers. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
5713bcc3 |
|
08-May-2020 |
Christoph Hellwig <hch@lst.de> |
nvdimm/btt: stop using ->queuedata In preparation for removing queuedata as an argument to make_request_fn() drop the dependency ->queuedata. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20200508161517.252308-15-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3d745ea5 |
|
27-Mar-2020 |
Christoph Hellwig <hch@lst.de> |
block: simplify queue allocation Current make_request based drivers use either blk_alloc_queue_node or blk_alloc_queue to allocate a queue, and then set up the make_request_fn function pointer and a few parameters using the blk_queue_make_request helper. Simplify this by passing the make_request pointer to blk_alloc_queue, and while at it merge the _node variant into the main helper by always passing a node_id, and remove the superfluous gfp_mask parameter. A lower-level __blk_alloc_queue is kept for the blk-mq case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
4e24e37d |
|
31-Oct-2019 |
Qian Cai <cai@lca.pw> |
libnvdimm/btt: fix variable 'rc' set but not used drivers/nvdimm/btt.c: In function 'btt_read_pg': drivers/nvdimm/btt.c:1264:8: warning: variable 'rc' set but not used [-Wunused-but-set-variable] int rc; ^~ Add a ratelimited message in case a storm of errors is encountered. Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/1572530719-32161-1-git-send-email-cai@lca.pw Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
8f4b01fc |
|
31-Oct-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
libnvdimm/namespace: Differentiate between probe mapping and runtime mapping The nvdimm core currently maps the full namespace to an ioremap range while probing the namespace mode. This can result in probe failures on architectures that have limited ioremap space. For example, with a large btt namespace that consumes most of I/O remap range, depending on the sequence of namespace initialization, the user can find a pfn namespace initialization failure due to unavailable I/O remap space which nvdimm core uses for temporary mapping. nvdimm core can avoid this failure by only mapping the reserved info block area to check for pfn superblock type and map the full namespace resource only before using the namespace. Given that personalities like BTT can be layered on top of any namespace type create a generic form of devm_nsio_enable (devm_namespace_enable) and use it inside the per-personality attach routines. Now devm_namespace_enable() is always paired with disable unless the mapping is going to be used for long term runtime access. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20191017073308.32645-1-aneesh.kumar@linux.ibm.com [djbw: reworks to move devm_namespace_{en,dis}able into *attach helpers] Reported-by: kbuild test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20191031105741.102793-2-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
86aa6668 |
|
09-Aug-2019 |
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> |
libnvdimm: Fix endian conversion issues nd_label->dpa issue was observed when trying to enable the namespace created with little-endian kernel on a big-endian kernel. That made me run `sparse` on the rest of the code and other changes are the result of that. Fixes: d9b83c756953 ("libnvdimm, btt: rework error clearing") Fixes: 9dedc73a4658 ("libnvdimm/btt: Fix LBA masking during 'free list' population") Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Link: https://lore.kernel.org/r/20190809074726.27815-1-aneesh.kumar@linux.ibm.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2025cf9e |
|
29-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
9dedc73a |
|
27-Feb-2019 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm/btt: Fix LBA masking during 'free list' population The Linux BTT implementation assumes that log entries will never have the 'zero' flag set, and indeed it never sets that flag for log entries itself. However, the UEFI spec is ambiguous on the exact format of the LBA field of a log entry, specifically as to whether it should include the additional flag bits or not. While a zero bit doesn't make sense in the context of a log entry, other BTT implementations might still have it set. If an implementation does happen to have it set, we would happily read it in as the next block to write to for writes. Since a high bit is set, it pushes the block number out of the range of an 'arena', and we fail such a write with an EIO. Follow the robustness principle, and tolerate such implementations by stripping out the zero flag when populating the free list during initialization. Additionally, use the same stripped out entries for detection of incomplete writes and map restoration that happens at this stage. Add a sysfs file 'log_zero_flags' that indicates the ability to accept such a layout to userspace applications. This enables 'ndctl check-namespace' to recognize whether the kernel is able to handle zero flags, or whether it should attempt a fix-up under the --repair option. Cc: Dan Williams <dan.j.williams@intel.com> Reported-by: Dexuan Cui <decui@microsoft.com> Reported-by: Pedro d'Aquino Filocre F S Barbuda <pbarbuda@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
2f8c9011 |
|
27-Feb-2019 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm/btt: Remove unnecessary code in btt_freelist_init We call btt_log_read() twice, once to get the 'old' log entry, and again to get the 'new' entry. However, we have no use for the 'old' entry, so remove it. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
fef912bf |
|
28-Sep-2018 |
Hannes Reinecke <hare@suse.de> |
block: genhd: add 'groups' argument to device_add_disk Update device_add_disk() to take an 'groups' argument so that individual drivers can register a device with additional sysfs attributes. This avoids race condition the driver would otherwise have if these groups were to be created with sysfs_add_groups(). Signed-off-by: Martin Wilck <martin.wilck@suse.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
3f289dcb |
|
18-Jul-2018 |
Tejun Heo <tj@kernel.org> |
block: make bdev_ops->rw_page() take a REQ_OP instead of bool c11f0c0b5bb9 ("block/mm: make bdev_ops->rw_page() take a bool for read/write") replaced @op with boolean @is_write, which limited the amount of information going into ->rw_page() and more importantly page_endio(), which removed the need to expose block internals to mm. Unfortunately, we want to track discards separately and @is_write isn't enough information. This patch updates bdev_ops->rw_page() to take REQ_OP instead but leaves page_endio() to take bool @is_write. This allows the block part of operations to have enough information while not leaking it to mm. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
8b904b5b |
|
07-Mar-2018 |
Bart Van Assche <bvanassche@acm.org> |
block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() This patch has been generated as follows: for verb in set_unlocked clear_unlocked set clear; do replace-in-files queue_flag_${verb} blk_queue_flag_${verb%_unlocked} \ $(git grep -lw queue_flag_${verb} drivers block/bsg*) done Except for protecting all queue flag changes with the queue lock this patch does not change any functionality. Cc: Mike Snitzer <snitzer@redhat.com> Cc: Shaohua Li <shli@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ming Lei <ming.lei@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
3ffb0ba9 |
|
05-Mar-2018 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, {btt, blk}: do integrity setup before add_disk() Prior to 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") we needed to temporarily add a zero-capacity disk before registering for blk-integrity. But adding a zero-capacity disk caused the partition table scanning to bail early, and this resulted in partitions not coming up after a probe of the BTT or blk namespaces. We can now register for integrity before the disk has been added, and this fixes the rescan problems. Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk") Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d08cd5e0 |
|
13-Dec-2017 |
Jeff Moyer <jmoyer@redhat.com> |
libnvdimm, btt: fix uninitialized err_lock When a sector mode namespace is initially created, the arena's err_lock is not initialized. If, on the other hand, the namespace already exists, the mutex is initialized. To fix the issue, I moved the mutex initialization into the arena_alloc, which is called by both discover_arenas and create_arenas. This was discovered on an older kernel where mutex_trylock checks the count to determine whether the lock is held. Because the data structure is kzalloc-d, that count was 0 (held), and I/O to the device would hang forever waiting for the lock to be released (see btt_write_pg, for example). Current kernels have a different mutex implementation that checks for a non-null owner, and so this doesn't show up as a problem. If that lock were ever contended, it might cause issues, but you'd have to be really unlucky, I think. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
24e3a7fb |
|
18-Dec-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: Fix an incompatibility in the log layout Due to a spec misinterpretation, the Linux implementation of the BTT log area had different padding scheme from other implementations, such as UEFI and NVML. This fixes the padding scheme, and defaults to it for new BTT layouts. We attempt to detect the padding scheme in use when probing for an existing BTT. If we detect the older/incompatible scheme, we continue using it. Reported-by: Juston Li <juston.li@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Fixes: 5212e11fde4d ("nd_btt: atomic sector updates") Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
23c47d2a |
|
15-Nov-2017 |
Minchan Kim <minchan@kernel.org> |
bdi: introduce BDI_CAP_SYNCHRONOUS_IO As discussed at https://lkml.kernel.org/r/<20170728165604.10455-1-ross.zwisler@linux.intel.com> someday we will remove rw_page(). If so, we need something to detect such super-fast storage on which synchronous IO operations like the current rw_page are always a win. Introduces BDI_CAP_SYNCHRONOUS_IO to indicate such devices. With it, we could use various optimization techniques. Link: http://lkml.kernel.org/r/1505886205-9671-3-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
04c3c982 |
|
08-Sep-2017 |
Randy Dunlap <rdunlap@infradead.org> |
libnvdimm, btt: fix format string warnings Fix format warnings (seen on i386) in nvdimm/btt.c: ../drivers/nvdimm/btt.c: In function ‘btt_map_init’: ../drivers/nvdimm/btt.c:430:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=] dev_WARN_ONCE(to_dev(arena), size < 512, ^ ../drivers/nvdimm/btt.c: In function ‘btt_log_init’: ../drivers/nvdimm/btt.c:474:3: warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Wformat=] dev_WARN_ONCE(to_dev(arena), size < 512, ^ Fixes: 86652d2eb347 ("libnvdimm, btt: clean up warning and error messages") Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
86652d2e |
|
05-Sep-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: clean up warning and error messages Convert all WARN* style messages to dev_WARN, and for errors in the IO paths, use dev_err_ratelimited. Also remove some BUG_ONs in the IO path and replace them with the above - no need to crash the machine in case of an unaligned IO. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
98cc093c |
|
06-Sep-2017 |
Huang Ying <ying.huang@intel.com> |
block, THP: make block_device_operations.rw_page support THP The .rw_page in struct block_device_operations is used by the swap subsystem to read/write the page contents from/into the corresponding swap slot in the swap device. To support the THP (Transparent Huge Page) swap optimization, the .rw_page is enhanced to support to read/write THP if possible. Link: http://lkml.kernel.org/r/20170724051840.2309-6-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@intel.com> [for brd.c, zram_drv.c, pmem.c] Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vishal L Verma <vishal.l.verma@intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Shaohua Li <shli@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
d9b83c75 |
|
30-Aug-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: rework error clearing Clearing errors or badblocks during a BTT write requires sending an ACPI DSM, which means potentially sleeping. Since a BTT IO happens in atomic context (preemption disabled, spinlocks may be held), we cannot perform error clearing in the course of an IO. Due to this error clearing for BTT IOs has hitherto been disabled. In this patch we move error clearing out of the atomic section, and thus re-enable error clearing with BTTs. When we are about to add a block to the free list, we check if it was previously marked as an error, and if it was, we add it to the freelist, but also set a flag that says error clearing will be required. We then drop the lane (ending the atomic context), and send a zero buffer so that the error can be cleared. The error flag in the free list is protected by the nd 'lane', and is set only be a thread while it holds that lane. When the error is cleared, the flag is cleared, but while holding a mutex for that freelist index. When writing, we check for two things - 1/ If the freelist mutex is held or if the error flag is set. If so, this is an error block that is being (or about to be) cleared. 2/ If the block is a known badblock based on nsio->bb The second check is required because the BTT map error flag for a map entry only gets set when an error LBA is read. If we write to a new location that may not have the map error flag set, but still might be in the region's badblock list, we can trigger an EIO on the write, which is undesirable and completely avoidable. Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
75892004 |
|
30-Aug-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: cache sector_size in arena_info In preparation for the error clearing rework, add sector_size in the arena_info struct. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
1398199d |
|
30-Aug-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: ensure that flags were also unchanged during a map_read In btt_map_read, we read the map twice to make sure that the map entry didn't change after we added it to the read tracking table. In anticipation of expanding the use of the error bit, also make sure that the error and zero flags are constant across the two map reads. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
0595d539 |
|
30-Aug-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: refactor map entry operations with macros Add helpers for converting a raw map entry to just the block number, or either of the 'e' or 'z' flags in preparation for actually using the error flag to mark blocks with media errors. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
1db1f3ce |
|
30-Aug-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: fix a missed NVDIMM_IO_ATOMIC case in the write path The IO context conversion for rw_bytes missed a case in the BTT write path (btt_map_write) which should've been marked as atomic. In reality this should not cause a problem, because map writes are to small for nsio_rw_bytes to attempt error clearing, but it should be fixed for posterity. Add a might_sleep() in the non-atomic section of nsio_rw_bytes so that things like the nfit unit tests, which don't actually sleep, can catch bugs like this. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
ed36b4db |
|
27-Aug-2017 |
Christophe Jaillet <christophe.jaillet@wanadoo.fr> |
libnvdimm, btt: check memory allocation failure Check memory allocation failures and return -ENOMEM in such cases, as already done few lines below for another memory allocation. This avoids NULL pointers dereference. Cc: <stable@vger.kernel.org> Fixes: 14e494542636 ("libnvdimm, btt: BTT updates for UEFI 2.7 format") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b1fb2c52 |
|
29-Jun-2017 |
Dmitry Monakhov <dmonakhov@openvz.org> |
block: guard bvec iteration logic Currently if some one try to advance bvec beyond it's size we simply dump WARN_ONCE and continue to iterate beyond bvec array boundaries. This simply means that we endup dereferencing/corrupting random memory region. Sane reaction would be to propagate error back to calling context But bvec_iter_advance's calling context is not always good for error handling. For safity reason let truncate iterator size to zero which will break external iteration loop which prevent us from unpredictable memory range corruption. And even it caller ignores an error, it will corrupt it's own bvecs, not others. This patch does: - Return error back to caller with hope that it will react on this - Truncate iterator size Code was added long time ago here 4550dd6c, luckily no one hit it in real life :) Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> [hch: switch to true/false returns instead of errno values] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e23947bd |
|
29-Jun-2017 |
Dmitry Monakhov <dmonakhov@openvz.org> |
bio-integrity: fold bio_integrity_enabled to bio_integrity_prep Currently all integrity prep hooks are open-coded, and if prepare fails we ignore it's code and fail bio with EIO. Let's return real error to upper layer, so later caller may react accordingly. In fact no one want to use bio_integrity_prep() w/o bio_integrity_enabled, so it is reasonable to fold it in to one function. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> [hch: merged with the latest block tree, return bool from bio_integrity_prep] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e6be2dcb |
|
30-Jun-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: convert some info messages to warn/err Some critical messages such as IO errors, metadata failures were printed with dev_info. Make them louder by upgrading them to dev_warn or dev_error. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
c13c43d5 |
|
29-Jun-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: fix btt_rw_page not returning errors btt_rw_page was not propagating errors frm btt_do_bvec, resulting in any IO errors via the rw_page path going unnoticed. the pmem driver recently fixed this in e10624f pmem: fail io-requests to known bad blocks but same problem in BTT went neglected. Fixes: 5212e11fde4d ("nd_btt: atomic sector updates") Cc: <stable@vger.kernel.org> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
14e49454 |
|
28-Jun-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: BTT updates for UEFI 2.7 format The UEFI 2.7 specification defines an updated BTT metadata format, bumping the revision to 2.0. Add support for the new format, while retaining compatibility for the old 1.1 format. Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Linda Knippers <linda.knippers@hpe.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
0b0bcacc |
|
19-Jun-2017 |
Christoph Hellwig <hch@lst.de> |
block: don't bother with bounce limits for make_request drivers We only call blk_queue_bounce for request-based drivers, so stop messing with it for make_request based drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
4e4cbee9 |
|
03-Jun-2017 |
Christoph Hellwig <hch@lst.de> |
block: switch bios to blk_status_t Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep arround at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
b177fe85 |
|
10-May-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: ensure that initializing metadata clears poison If we had badblocks/poison in the metadata area of a BTT, recreating the BTT would not clear the poison in all cases, notably the flog area. This is because rw_bytes will only clear errors if the request being sent down is 512B aligned and sized. Make sure that when writing the map and info blocks, the rw_bytes being sent are of the correct size/alignment. For the flog, instead of doing the smaller log_entry writes only, first do a 'wipe' of the entire area by writing zeroes in large enough chunks so that errors get cleared. Cc: Andy Rudoff <andy.rudoff@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
3ae3d67b |
|
10-May-2017 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm: add an atomic vs process context flag to rw_bytes nsio_rw_bytes can clear media errors, but this cannot be done while we are in an atomic context due to locking within ACPI. From the BTT, ->rw_bytes may be called either from atomic or process context depending on whether the calls happen during initialization or during IO. During init, we want to ensure error clearing happens, and the flag marking process context allows nsio_rw_bytes to do that. When called during IO, we're in atomic context, and error clearing can be skipped. Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
abe8b4e3 |
|
27-Jul-2016 |
Vishal Verma <vishal.l.verma@intel.com> |
nvdimm, btt: add a size attribute for BTTs To be consistent with other namespaces, expose a 'size' attribute for BTT devices also. Cc: Dan Williams <dan.j.williams@intel.com> Reported-by: Linda Knippers <linda.knippers@hpe.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
c11f0c0b |
|
05-Aug-2016 |
Jens Axboe <axboe@fb.com> |
block/mm: make bdev_ops->rw_page() take a bool for read/write Commit abf545484d31 changed it from an 'rw' flags type to the newer ops based interface, but now we're effectively leaking some bdev internals to the rest of the kernel. Since we only care about whether it's a read or a write at that level, just pass in a bool 'is_write' parameter instead. Then we can also move op_is_write() and friends back under CONFIG_BLOCK protection. Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
abf54548 |
|
04-Aug-2016 |
Mike Christie <mchristi@redhat.com> |
mm/block: convert rw_page users to bio op use The rw_page users were not converted to use bio/req ops. As a result bdev_write_page is not passing down REQ_OP_WRITE and the IOs will be sent down as reads. Signed-off-by: Mike Christie <mchristi@redhat.com> Fixes: 4e1b2d52a80d ("block, fs, drivers: remove REQ_OP compat defs and related code") Modified by me to: 1) Drop op_flags passing into ->rw_page(), as we don't use it. 2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
0d52c756 |
|
15-Jun-2016 |
Dan Williams <dan.j.williams@intel.com> |
block: convert to device_add_disk() For block drivers that specify a parent device, convert them to use device_add_disk(). This conversion was done with the following semantic patch: @@ struct gendisk *disk; expression E; @@ - disk->driverfs_dev = E; ... - add_disk(disk); + device_add_disk(E, disk); @@ struct gendisk *disk; expression E1, E2; @@ - disk->driverfs_dev = E1; ... E2 = disk; ... - add_disk(E2); + device_add_disk(E1, E2); ...plus some manual fixups for a few missed conversions. Cc: Jens Axboe <axboe@fb.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
9dec4892 |
|
22-Apr-2016 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, btt: add btt startup debug Report the reason for btt probe failures when debug is enabled. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e32bc729 |
|
17-Mar-2016 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, btt, convert nd_btt_probe() to devm Pass the device performing the probe so we can use a devm allocation for the btt superblock. Cc: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
298f2bc5 |
|
15-Mar-2016 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, pmem: kill pmem->ndns We can derive the common namespace from other information. We also do not need to cache it because all the usages are in slow paths. Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
09cbfeaf |
|
01-Apr-2016 |
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> |
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
ff8e92d5 |
|
09-Mar-2016 |
NeilBrown <neilb@suse.com> |
nvdimm/btt: don't allocate unused major device number alloc_disk(0) does not require or use a ->major number, all devices are allocated with a major of BLOCK_EXT_MAJOR. So don't allocate btt_major. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
dece1635 |
|
05-Nov-2015 |
Jens Axboe <axboe@fb.com> |
block: change ->make_request_fn() and users to return a queue cookie No functional changes in this patch, but it prepares us for returning a more useful cookie related to the IO that was queued up. Signed-off-by: Jens Axboe <axboe@fb.com> Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Keith Busch <keith.busch@intel.com>
|
#
9609b994 |
|
21-Oct-2015 |
Dan Williams <dan.j.williams@intel.com> |
md, dm, scsi, nvme, libnvdimm: drop blk_integrity_unregister() at shutdown Now that the integrity profile is statically allocated there is no work to do when shutting down an integrity enabled block device. Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: James Bottomley <JBottomley@Odin.com> Acked-by: NeilBrown <neilb@suse.com> Acked-by: Keith Busch <keith.busch@intel.com> Acked-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
e1455744 |
|
30-Jul-2015 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, pfn: 'struct page' provider infrastructure Implement the base infrastructure for libnvdimm PFN devices. Similar to BTT devices they take a namespace as a backing device and layer functionality on top. In this case the functionality is reserving space for an array of 'struct page' entries to be handed out through pfn_to_page(). For now this is just the basic libnvdimm-device-model for configuring the base PFN device. As the namespace claiming mechanism for PFN devices is mostly identical to BTT devices drivers/nvdimm/claim.c is created to house the common bits. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
6ec68954 |
|
29-Jul-2015 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: write and validate parent_uuid When a BTT is instantiated on a namespace it must validate the namespace uuid matches the 'parent_uuid' stored in the btt superblock. This property enforces that changing the namespace UUID invalidates all former BTT instances on that storage. For "IO namespaces" that don't have a label or UUID, the parent_uuid is set to zero, and this validation is skipped. For such cases, old BTTs have to be invalidated by forcing the namespace to raw mode, and overwriting the BTT info blocks. Based on a patch by Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
ab45e763 |
|
29-Jul-2015 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: consolidate arena validation Use arena_is_valid as a common routine for checking the validity of an info block from both discover_arenas, and nd_btt_probe. As a result, don't check for validity of the BTT's UUID, and lbasize. The checksum in the BTT info block guarantees self-consistency, and when we're called from nd_btt_probe, we don't have a valid uuid or lbasize available to check against. Also cleanup to return a bool instead of an int. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
fbde1414 |
|
29-Jul-2015 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: clean up internal interfaces Consolidate the parameters passed to arena_is_valid into just nd_btt, and an info block to increase re-usability. Similarly, btt_arena_write_layout doesn't need to be passed a uuid, as it can be obtained from arena->nd_btt. Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4246a0b6 |
|
20-Jul-2015 |
Christoph Hellwig <hch@lst.de> |
block: add a bi_error field to struct bio Currently we have two different ways to signal an I/O error on a BIO: (1) by clearing the BIO_UPTODATE flag (2) by returning a Linux errno value to the bi_end_io callback The first one has the drawback of only communicating a single possible error (-EIO), and the second one has the drawback of not beeing persistent when bios are queued up, and are not passed along from child to parent bio in the ever more popular chaining scenario. Having both mechanisms available has the additional drawback of utterly confusing driver authors and introducing bugs where various I/O submitters only deal with one of them, and the others have to add boilerplate code to deal with both kinds of error returns. So add a new bi_error field to store an errno value directly in struct bio and remove the existing mechanisms to clean all this up. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
5e329406 |
|
11-Jul-2015 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, btt: sparse fix Fix: drivers/nvdimm/btt.c:635:29: warning: restricted __le64 degrades to integer Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
58138820 |
|
23-Jun-2015 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only Upon detection of an unarmed dimm in a region, arrange for descendant BTT, PMEM, or BLK instances to be read-only. A dimm is primarily marked "unarmed" via flags passed by platform firmware (NFIT). The flags in the NFIT memory device sub-structure indicate the state of the data on the nvdimm relative to its energy source or last "flush to persistence". For the most part there is nothing the driver can do but advertise the state of these flags in sysfs and emit a message if firmware indicates that the contents of the device may be corrupted. However, for the case of ACPI_NFIT_MEM_ARMED, the driver can arrange for the block devices incorporating that nvdimm to be marked read-only. This is a safe default as the data is still available and new writes are held off until the administrator either forces read-write mode, or the energy source becomes armed. A 'read_only' attribute is added to REGION devices to allow for overriding the default read-only policy of all descendant block devices. Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
f0dc089c |
|
15-May-2015 |
Dan Williams <dan.j.williams@intel.com> |
libnvdimm: enable iostat This is disabled by default as the overhead is prohibitive, but if the user takes the action to turn it on we'll oblige. Reviewed-by: Vishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
41cd8b70 |
|
25-Jun-2015 |
Vishal Verma <vishal.l.verma@intel.com> |
libnvdimm, btt: add support for blk integrity Support multiple block sizes (sector + metadata) using the blk integrity framework. This registers a new integrity template that defines the protection information tuple size based on the configured metadata size, and simply acts as a passthrough for protection information generated by another layer. The metadata is written to the storage as-is, and read back with each sector. Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5212e11f |
|
25-Jun-2015 |
Vishal Verma <vishal.l.verma@intel.com> |
nd_btt: atomic sector updates BTT stands for Block Translation Table, and is a way to provide power fail sector atomicity semantics for block devices that have the ability to perform byte granularity IO. It relies on the capability of libnvdimm namespace devices to do byte aligned IO. The BTT works as a stacked blocked device, and reserves a chunk of space from the backing device for its accounting metadata. It is a bio-based driver because all IO is done synchronously, and there is no queuing or asynchronous completions at either the device or the driver level. The BTT uses 'lanes' to index into various 'on-disk' data structures, and lanes also act as a synchronization mechanism in case there are more CPUs than available lanes. We did a comparison between two lane lock strategies - first where we kept an atomic counter around that tracked which was the last lane that was used, and 'our' lane was determined by atomically incrementing that. That way, for the nr_cpus > nr_lanes case, theoretically, no CPU would be blocked waiting for a lane. The other strategy was to use the cpu number we're scheduled on to and hash it to a lane number. Theoretically, this could block an IO that could've otherwise run using a different, free lane. But some fio workloads showed that the direct cpu -> lane hash performed faster than tracking 'last lane' - my reasoning is the cache thrash caused by moving the atomic variable made that approach slower than simply waiting out the in-progress IO. This supports the conclusion that the driver can be a very simple bio-based one that does synchronous IOs instead of queuing. Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jens Axboe <axboe@fb.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Neil Brown <neilb@suse.de> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Greg KH <gregkh@linuxfoundation.org> [jmoyer: fix nmi watchdog timeout in btt_map_init] [jmoyer: move btt initialization to module load path] [jmoyer: fix memory leak in the btt initialization path] [jmoyer: Don't overwrite corrupted arenas] Signed-off-by: Vishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|