#
a28d893e |
|
23-Jan-2024 |
Christian Brauner <brauner@kernel.org> |
md: port block device access to file Link: https://lore.kernel.org/r/20240123-vfs-bdev-file-v2-4-adbd023e19cc@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
c2fce61f |
|
27-Sep-2023 |
Jan Kara <jack@suse.cz> |
dm: Convert to bdev_open_by_dev() Convert device mapper to use bdev_open_by_dev() and pass the handle around. CC: Alasdair Kergon <agk@redhat.com> CC: Mike Snitzer <snitzer@kernel.org> CC: dm-devel@redhat.com Acked-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230927093442.25915-10-jack@suse.cz Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
05bdb996 |
|
08-Jun-2023 |
Christoph Hellwig <hch@lst.de> |
block: replace fmode_t with a block-specific type for block open flags The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
d4a28d7d |
|
31-May-2023 |
Christoph Hellwig <hch@lst.de> |
dm: remove dm_get_dev_t Open code dm_get_dev_t in the only remaining caller, and propagate the exact error code from lookup_bdev and early_lookup_bdev. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230531125535.676098-20-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
f7995089 |
|
13-Apr-2023 |
Mike Snitzer <snitzer@kernel.org> |
dm: unexport dm_get_queue_limits() There are no dm_get_queue_limits() callers outside of DM core and there shouldn't be. Also, remove its BUG_ON(!atomic_read(&md->holders)) to micro-optimize __process_abnormal_io(). Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
13f6facf |
|
13-Apr-2023 |
Mike Snitzer <snitzer@kernel.org> |
dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE Introduce max_write_zeroes_granularity and max_secure_erase_granularity flags in the dm_target struct. If a target sets these then DM core will split IO of these operation types accordingly (in terms of max_write_zeroes_sectors and max_secure_erase_sectors respectively). Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
3664ff82 |
|
09-Apr-2023 |
Yangtao Li <frank.li@vivo.com> |
dm: add helper macro for simple DM target module init and exit Eliminate duplicate boilerplate code for simple modules that contain a single DM target driver without any additional setup code. Add a new module_dm() macro, which replaces the module_init() and module_exit() with template functions that call dm_register_target() and dm_unregister_target() respectively. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
06961c48 |
|
28-Feb-2023 |
Mike Snitzer <snitzer@kernel.org> |
dm: split discards further if target sets max_discard_granularity The block core (bio_split_discard) will already split discards based on the 'discard_granularity' and 'max_discard_sectors' queue_limits. But the DM thin target also needs to ensure that it doesn't receive a discard that spans a 'max_discard_sectors' boundary. Introduce a dm_target 'max_discard_granularity' flag that if set will cause DM core to split discard bios relative to 'max_discard_sectors'. This treats 'discard_granularity' as a "min_discard_granularity" and 'max_discard_sectors' as a "max_discard_granularity". Requested-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
a4a82ce3 |
|
26-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: correct block comments format. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
44bc08ed |
|
01-Feb-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: enclose complex macros into parentheses where possible Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
86a3238c |
|
25-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: change "unsigned" to "unsigned int" Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
3bd94003 |
|
25-Jan-2023 |
Heinz Mauelshagen <heinzm@redhat.com> |
dm: add missing SPDX-License-Identifiers 'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has deprecated its use. Suggested-by: John Wiele <jwiele@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
9dd1cd32 |
|
20-Jul-2022 |
Mike Snitzer <snitzer@kernel.org> |
dm: fix dm-raid crash if md_handle_request() splits bio Commit ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") introduced the optimization to _not_ perform bio_associate_blkg()'s relatively costly work when DM core clones its bio. But in doing so it exposed the possibility for DM's cloned bio to alter DM target behavior (e.g. crash) if a target were to issue IO without first calling bio_set_dev(). The DM raid target can trigger an MD crash due to its need to split the DM bio that is passed to md_handle_request(). The split will recurse to submit_bio_noacct() using a bio with an uninitialized ->bi_blkg. This NULL bio->bi_blkg causes blk_throtl_bio() to dereference a NULL blkg_to_tg(bio->bi_blkg). Fix this in DM core by adding a new 'needs_bio_set_dev' target flag that will make alloc_tio() call bio_set_dev() on behalf of the target. dm-raid is the only target that requires this flag. bio_set_dev() initializes the DM cloned bio's ->bi_blkg, using bio_associate_blkg, before passing the bio to md_handle_request(). Long-term fix would be to audit and refactor MD code to rely on DM to split its bio, using dm_accept_partial_bio(), but there are MD raid personalities (e.g. raid1 and raid10) whose implementation are tightly coupled to handling the bio splitting inline. Fixes: ca522482e3eaf ("dm: pass NULL bdev to bio_alloc_clone") Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
2aec377a |
|
05-Jul-2022 |
Mike Snitzer <snitzer@kernel.org> |
dm table: remove dm_table_get_num_targets() wrapper More efficient and readable to just access table->num_targets directly. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
#
047218ec |
|
22-Apr-2022 |
Jane Chu <jane.chu@oracle.com> |
dax: add .recovery_write dax_operation Introduce dax_recovery_write() operation. The function is used to recover a dax range that contains poison. Typical use case is when a user process receives a SIGBUS with si_code BUS_MCEERR_AR indicating poison(s) in a dax range, in response, the user process issues a pwrite() to the page-aligned dax range, thus clears the poison and puts valid data in the range. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jane Chu <jane.chu@oracle.com> Link: https://lore.kernel.org/r/20220422224508.440670-6-jane.chu@oracle.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e511c4a3 |
|
13-May-2022 |
Jane Chu <jane.chu@oracle.com> |
dax: introduce DAX_RECOVERY_WRITE dax access mode Up till now, dax_direct_access() is used implicitly for normal access, but for the purpose of recovery write, dax range with poison is requested. To make the interface clear, introduce enum dax_access_mode { DAX_ACCESS, DAX_RECOVERY_WRITE, } where DAX_ACCESS is used for normal dax access, and DAX_RECOVERY_WRITE is used for dax recovery write. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
b7f8dff0 |
|
10-Mar-2022 |
Mike Snitzer <snitzer@redhat.com> |
dm: simplify dm_submit_bio_remap interface Remove the from_wq argument from dm_submit_bio_remap(). Eliminates the need for dm_submit_bio_remap() callers to know whether they are calling from a workqueue or from the original dm_submit_bio(). Add map_task to dm_io struct, record the map_task in alloc_io and clear it after all target ->map() calls have completed. Update dm_submit_bio_remap to check if 'current' matches io->map_task rather than rely on passed 'from_wq' argument. This change really simplifies the chore of porting each DM target to using dm_submit_bio_remap() because there is no longer the risk of programming error by not completely knowing all the different contexts a particular method that calls dm_submit_bio_remap() might be used in. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
a773187e |
|
09-Feb-2022 |
Christoph Hellwig <hch@lst.de> |
scsi: dm: Remove WRITE_SAME support There are no more end-users of REQ_OP_WRITE_SAME left, so we can start deleting it. Link: https://lore.kernel.org/r/20220209082828.2629273-7-hch@lst.de Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
#
a8b9d116 |
|
26-Jan-2022 |
Tom Rix <trix@redhat.com> |
dm: cleanup double word in comment Remove the second 'a'. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
0fbb4d93 |
|
17-Feb-2022 |
Mike Snitzer <snitzer@redhat.com> |
dm: add dm_submit_bio_remap interface Where possible, switch from early bio-based IO accounting (at the time DM clones each incoming bio) to late IO accounting just before each remapped bio is issued to underlying device via submit_bio_noacct(). Allows more precise bio-based IO accounting for DM targets that use their own workqueues to perform additional processing of each bio in conjunction with their DM_MAPIO_SUBMITTED return from their map function. When a target is updated to use dm_submit_bio_remap() they must also set ti->accounts_remapped_io to true. Use xchg() in start_io_acct(), as suggested by Mikulas, to ensure each IO is only started once. The xchg race only happens if __send_duplicate_bios() sends multiple bios -- that case is reflected via tio->is_duplicate_bio. Given the niche nature of this race, it is best to avoid any xchg performance penalty for normal IO. For IO that was never submitted with dm_submit_bio_remap(), but the target completes the clone with bio_endio, accounting is started then ended and pending_io counter decremented. Reviewed-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
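The "started only once" guarantee described in the entry above can be illustrated in userspace with C11 atomics. This is only a sketch of the exchange idiom, not the kernel's start_io_acct(): `struct io`, `io_init()` and `start_io_acct_once()` are hypothetical names, and `atomic_exchange()` stands in for the kernel's xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-IO state; not the kernel's struct dm_io. */
struct io {
	atomic_bool accounted;	/* set once accounting has been started */
	unsigned int starts;	/* how many times accounting really began */
};

void io_init(struct io *io)
{
	assert(io != NULL);
	atomic_init(&io->accounted, false);
	io->starts = 0;
}

/*
 * Start accounting at most once. atomic_exchange() returns the previous
 * value, so only the first caller sees "false" and does the work; any
 * later (or concurrent) caller sees "true" and backs off.
 */
bool start_io_acct_once(struct io *io)
{
	assert(io != NULL);
	if (atomic_exchange(&io->accounted, true))
		return false;
	io->starts++;
	return true;
}
```

As the commit message notes, the kernel only pays for the atomic exchange on the duplicate-bio path; this sketch always uses it for simplicity.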
#
7ac5360c |
|
15-Dec-2021 |
Christoph Hellwig <hch@lst.de> |
dax: remove the copy_from_iter and copy_to_iter methods These methods indirect the actual DAX read/write path. In the end pmem uses magic flush and mc safe variants and fuse and dcssblk use plain ones while device mapper picks redirects to the underlying device. Add set_dax_nocache() and set_dax_nomc() APIs to control which copy routines are used to remove indirect call from the read/write fast path as well as a lot of boilerplate code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> [virtiofs] Link: https://lore.kernel.org/r/20211215084508.435401-5-hch@lst.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
cb77cb5a |
|
18-Oct-2021 |
Eric Biggers <ebiggers@google.com> |
blk-crypto: rename blk_keyslot_manager to blk_crypto_profile blk_keyslot_manager is misnamed because it doesn't necessarily manage keyslots. It actually does several different things: - Contains the crypto capabilities of the device. - Provides functions to control the inline encryption hardware. Originally these were just for programming/evicting keyslots; however, new functionality (hardware-wrapped keys) will require new functions here which are unrelated to keyslots. Moreover, device-mapper devices already (ab)use "keyslot_evict" to pass key eviction requests to their underlying devices even though device-mapper devices don't have any keyslots themselves (so it really should be "evict_key", not "keyslot_evict"). - Sometimes (but not always!) it manages keyslots. Originally it always did, but device-mapper devices don't have keyslots themselves, so they use a "passthrough keyslot manager" which doesn't actually manage keyslots. This hack works, but the terminology is unnatural. Also, some hardware doesn't have keyslots and thus also uses a "passthrough keyslot manager" (support for such hardware is yet to be upstreamed, but it will happen eventually). Let's stop having keyslot managers which don't actually manage keyslots. Instead, rename blk_keyslot_manager to blk_crypto_profile. This is a fairly big change, since for consistency it also has to update keyslot manager-related function names, variable names, and comments -- not just the actual struct name. However it's still a fairly straightforward change, as it doesn't change any actual functionality. Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Reviewed-by: Mike Snitzer <snitzer@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20211018180453.40441-4-ebiggers@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
8ec45662 |
|
12-Jul-2021 |
Tushar Sugandhi <tusharsu@linux.microsoft.com> |
dm: update target status functions to support IMA measurement For device mapper targets to take advantage of IMA's measurement capabilities, the status functions for the individual targets need to be updated to handle the status_type_t case for value STATUSTYPE_IMA. Update status functions for the following target types, to log their respective attributes to be measured using IMA. 01. cache 02. crypt 03. integrity 04. linear 05. mirror 06. multipath 07. raid 08. snapshot 09. striped 10. verity For rest of the targets, handle the STATUSTYPE_IMA case by setting the measurement buffer to NULL. For IMA to measure the data on a given system, the IMA policy on the system needs to be updated to have the following line, and the system needs to be restarted for the measurements to take effect. /etc/ima/ima-policy measure func=CRITICAL_DATA label=device-mapper template=ima-buf The measurements will be reflected in the IMA logs, which are located at: /sys/kernel/security/integrity/ima/ascii_runtime_measurements /sys/kernel/security/integrity/ima/binary_runtime_measurements These IMA logs can later be consumed by various attestation clients running on the system, and send them to external services for attesting the system. The DM target data measured by IMA subsystem can alternatively be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with DM_TABLE_STATUS_CMD. Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
91ccbbac |
|
12-Jul-2021 |
Tushar Sugandhi <tusharsu@linux.microsoft.com> |
dm ima: measure data on table load DM configures a block device with various target specific attributes passed to it as a table. DM loads the table, and calls each target’s respective constructors with the attributes as input parameters. Some of these attributes are critical to ensure the device meets certain security bar. Thus, IMA should measure these attributes, to ensure they are not tampered with, during the lifetime of the device. So that the external services can have high confidence in the configuration of the block-devices on a given system. Some devices may have large tables. And a given device may change its state (table-load, suspend, resume, rename, remove, table-clear etc.) many times. Measuring these attributes each time when the device changes its state will significantly increase the size of the IMA logs. Further, once configured, these attributes are not expected to change unless a new table is loaded, or a device is removed and recreated. Therefore the clear-text of the attributes should only be measured during table load, and the hash of the active/inactive table should be measured for the remaining device state changes. Export IMA function ima_measure_critical_data() to allow measurement of DM device parameters, as well as target specific attributes, during table load. Compute the hash of the inactive table and store it for measurements during future state change. If a load is called multiple times, update the inactive table hash with the hash of the latest populated table. So that the correct inactive table hash is measured when the device transitions to different states like resume, remove, rename, etc. Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> # leak fix Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
018eca45 |
|
20-Jul-2021 |
Guoqing Jiang <jiangguoqing@kylinos.cn> |
block: move some macros to blkdev.h Move them (PAGE_SECTORS_SHIFT, PAGE_SECTORS and SECTOR_MASK) to the generic header file to remove redundancy. Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn> Link: https://lore.kernel.org/r/20210721025315.1729118-1-guoqing.jiang@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
bb37d772 |
|
25-May-2021 |
Damien Le Moal <damien.lemoal@wdc.com> |
dm: introduce zone append emulation For zoned targets that cannot support zone append operations, implement an emulation using regular write operations. If the original BIO submitted by the user is a zone append operation, change its clone into a regular write operation directed at the target zone write pointer position. To do so, an array of write pointer offsets (write pointer position relative to the start of a zone) is added to struct mapped_device. All operations that modify a sequential zone write pointer (writes, zone reset, zone finish and zone append) are intercepted in __map_bio() and processed using the new function dm_zone_map_bio(). Detection of the target ability to natively support zone append operations is done from dm_table_set_restrictions() by calling the function dm_set_zones_restrictions(). A target that does not support zone append operation, either by explicitly declaring it using the new struct dm_target field zone_append_not_supported, or because the device table contains a non-zoned device, has its mapped device marked with the new flag DMF_ZONE_APPEND_EMULATED. The helper function dm_emulate_zone_append() is introduced to test a mapped device for this new flag. Atomicity of the zones write pointer tracking and updates is done using a zone write locking mechanism based on a bitmap. This is similar to the block layer method but based on BIOs rather than struct request. A zone write lock is taken in dm_zone_map_bio() for any clone BIO with an operation type that changes the BIO target zone write pointer position. The zone write lock is released if the clone BIO is failed before submission or when dm_zone_endio() is called when the clone BIO completes.
The zone write lock bitmap of the mapped device, together with a bitmap indicating zone types (conv_zones_bitmap) and the write pointer offset array (zwp_offset) are allocated and initialized with a full device zone report in dm_set_zones_restrictions() using the function dm_revalidate_zones(). For failed operations that may have modified a zone write pointer, the zone write pointer offset is marked as invalid in dm_zone_endio(). Zones with an invalid write pointer offset are checked and the write pointer updated using an internal report zone operation when the faulty zone is accessed again by the user. All functions added for this emulation have a minimal overhead for zoned targets natively supporting zone append operations. Regular device targets are also not affected. The added code also does not impact builds with CONFIG_BLK_DEV_ZONED disabled by stubbing out all dm zone related functions. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
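The core idea of the emulation above (turn an append into a regular write at the tracked write pointer, then advance the pointer) can be sketched in userspace as follows. Everything here is an illustrative assumption: `struct zoned_dev`, `emulate_zone_append()`, the fixed zone count and zone size are hypothetical, and the zone write locking and error revalidation described in the commit are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ZONES	4u	/* hypothetical device geometry */
#define ZONE_SECTORS	1024u

/* Hypothetical per-device state: one write pointer offset per zone. */
struct zoned_dev {
	uint32_t zwp_offset[NR_ZONES];	/* relative to the zone start */
};

/*
 * Emulate a zone append of nr_sectors into the given zone. Returns the
 * absolute sector where a regular write would be issued, or UINT64_MAX
 * if the zone index is invalid or the data does not fit in the zone.
 */
uint64_t emulate_zone_append(struct zoned_dev *dev, unsigned int zone,
			     uint32_t nr_sectors)
{
	uint32_t wp;
	uint64_t sector;

	assert(dev != NULL);
	if (zone >= NR_ZONES)
		return UINT64_MAX;

	wp = dev->zwp_offset[zone];
	if (nr_sectors > ZONE_SECTORS - wp)
		return UINT64_MAX;

	/* Regular write lands at zone start + current write pointer. */
	sector = (uint64_t)zone * ZONE_SECTORS + wp;

	/* Advance the tracked write pointer past the written sectors. */
	dev->zwp_offset[zone] = wp + nr_sectors;
	return sector;
}
```

The returned sector plays the role of the position reported back to the submitter of the original append; in a real implementation the read-modify-write of the offset would need the per-zone locking the commit describes.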
#
912e8875 |
|
25-May-2021 |
Damien Le Moal <damien.lemoal@wdc.com> |
dm: Introduce dm_report_zones() To simplify the implementation of the report_zones operation of a zoned target, introduce the function dm_report_zones() to set a target mapping start sector in struct dm_report_zones_args and call blkdev_report_zones(). This new function is exported and the report zones callback function dm_report_zones_cb() is not. dm-linear, dm-flakey and dm-crypt are modified to use dm_report_zones(). Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
7a35693a |
|
07-Apr-2021 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
dm: replace dm_vcalloc() Use kvcalloc or kvmalloc_array instead (depending whether zeroing is useful). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
2d669ceb |
|
15-Mar-2021 |
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> |
dm table: Fix zoned model check and zone sectors check Commit 24f6b6036c9e ("dm table: fix zoned iterate_devices based device capability checks") triggered dm table load failure when dm-zoned device is set up for zoned block devices and a regular device for cache. The commit inverted logic of two callback functions for iterate_devices: device_is_zoned_model() and device_matches_zone_sectors(). The logic of device_is_zoned_model() was inverted then all destination devices of all targets in dm table are required to have the expected zoned model. This is fine for dm-linear, dm-flakey and dm-crypt on zoned block devices since each target has only one destination device. However, this results in failure for dm-zoned with regular cache device since that target has both regular block device and zoned block devices. As for device_matches_zone_sectors(), the commit inverted the logic to require all zoned block devices in each target have the specified zone_sectors. This check also fails for regular block device which does not have zones. To avoid the check failures, fix the zone model check and the zone sectors check. For zone model check, introduce the new feature flag DM_TARGET_MIXED_ZONED_MODEL, and set it to dm-zoned target. When the target has this flag, allow it to have destination devices with any zoned model. For zone sectors check, skip the check if the destination device is not a zoned block device. Also add comments and improve an error message to clarify expectations to the two checks. Fixes: 24f6b6036c9e ("dm table: fix zoned iterate_devices based device capability checks") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
a666e5c0 |
|
10-Feb-2021 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: fix deadlock when swapping to encrypted device The system would deadlock when swapping to a dm-crypt device. The reason is that for each incoming write bio, dm-crypt allocates memory that holds encrypted data. These excessive allocations exhaust all the memory and the result is either deadlock or OOM trigger. This patch limits the number of in-flight swap bios, so that the memory consumed by dm-crypt is limited. The limit is enforced if the target sets the "limit_swap_bios" variable and if the bio has REQ_SWAP set. Non-swap bios are not affected because taking the semaphore would cause performance degradation. This is similar to request-based drivers - they will also block when the number of requests is over the limit. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
e3290b94 |
|
10-Feb-2021 |
Mike Snitzer <snitzer@redhat.com> |
dm: simplify target code conditional on CONFIG_BLK_DEV_ZONED Allow removal of CONFIG_BLK_DEV_ZONED conditionals in target_type definition of various targets. Suggested-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
aa6ce87a |
|
31-Jan-2021 |
Satya Tangirala <satyat@google.com> |
dm: add support for passing through inline crypto support Update the device-mapper core to support exposing the inline crypto support of the underlying device(s) through the device-mapper device. This works by creating a "passthrough keyslot manager" for the dm device, which declares support for encryption settings which all underlying devices support. When a supported setting is used, the bio cloning code handles cloning the crypto context to the bios for all the underlying devices. When an unsupported setting is used, the blk-crypto fallback is used as usual. Crypto support on each underlying device is ignored unless the corresponding dm target opts into exposing it. This is needed because for inline crypto to semantically operate on the original bio, the data must not be transformed by the dm target. Thus, targets like dm-linear can expose crypto support of the underlying device, but targets like dm-crypt can't. (dm-crypt could use inline crypto itself, though.) A DM device's table can only be changed if the "new" inline encryption capabilities are a (*not* necessarily strict) superset of the "old" inline encryption capabilities. Attempts to make changes to the table that result in some inline encryption capability becoming no longer supported will be rejected. For the sake of clarity, key eviction from underlying devices will be handled in a future patch. Co-developed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Satya Tangirala <satyat@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
9c37de29 |
|
07-Oct-2020 |
Mike Snitzer <snitzer@redhat.com> |
dm: remove special-casing of bio-based immutable singleton target on NVMe Since commit 5a6c35f9af416 ("block: remove direct_make_request") there is no benefit to DM special-casing NVMe. Remove all code used to establish DM_TYPE_NVME_BIO_BASED. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
6abc4946 |
|
23-Sep-2020 |
Konstantin Khlebnikov <koct9i@gmail.com> |
dm: add support for REQ_NOWAIT and enable it for linear target Add DM target feature flag DM_TARGET_NOWAIT which advertises that target works with REQ_NOWAIT bios. Add dm_table_supports_nowait() and update dm_table_set_restrictions() to set/clear QUEUE_FLAG_NOWAIT accordingly. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
5df96f2b |
|
23-Jul-2020 |
Mikulas Patocka <mpatocka@redhat.com> |
dm integrity: fix integrity recalculation that is improperly skipped Commit adc0daad366b62ca1bce3e2958a40b0b71a8b8b3 ("dm: report suspended device during destroy") broke integrity recalculation. The problem is dm_suspended() returns true not only during suspend, but also during resume. So this race condition could occur: 1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work) 2. integrity_recalc (&ic->recalc_work) preempts the current thread 3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret; 4. integrity_recalc exits and no recalculating is done. To fix this race condition, add a function dm_post_suspending that is only true during the postsuspend phase and use it instead of dm_suspended(). Signed-off-by: Mikulas Patocka <mpatocka redhat com> Fixes: adc0daad366b ("dm: report suspended device during destroy") Cc: stable vger kernel org # v4.18+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
21cf8661 |
|
01-Jul-2020 |
Christoph Hellwig <hch@lst.de> |
writeback: remove bdi->congested_fn Except for pktdvd, the only places setting congested bits are file systems that allocate their own backing_dev_info structures. And pktdvd is a deprecated driver that isn't useful in stack setup either. So remove the dead congested_fn stacking infrastructure. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Song Liu <song@kernel.org> Acked-by: David Sterba <dsterba@suse.com> [axboe: fixup unused variables in bcache/request.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
74244b59 |
|
14-May-2020 |
Hannes Reinecke <hare@suse.de> |
dm: use dynamic debug instead of compile-time config option Switch to use dynamic debug to avoid having to recompile the kernel just to enable debugging messages. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
087615bf |
|
30-Apr-2020 |
Gabriel Krisman Bertazi <krisman@collabora.com> |
dm mpath: pass IO start time to path selector The HST path selector needs this information to perform path prediction. For request-based mpath, struct request's io_start_time_ns is used, while for bio-based, use the start_time stored in dm_io. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
cdf6cdcd |
|
28-Feb-2020 |
Vivek Goyal <vgoyal@redhat.com> |
dm,dax: Add dax zero_page_range operation This patch adds support for dax zero_page_range operation to dm targets. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Mike Snitzer <snitzer@redhat.com> Link: https://lore.kernel.org/r/20200228163456.1587-5-vgoyal@redhat.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
d4100351 |
|
10-Nov-2019 |
Christoph Hellwig <hch@lst.de> |
block: rework zone reporting Avoid the need to allocate a potentially large array of struct blk_zone in the block layer by switching the ->report_zones method interface to a callback model. Now the caller simply supplies a callback that is executed on each reported zone, and private data for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
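The callback model described in the entry above can be sketched in plain C. This is a simplified userspace illustration, not the block layer interface: `struct zone`, `report_zones_cb`, `report_zones()` and `sum_zone_len()` are hypothetical names, and zones are assumed to be fixed-size except for a possibly shorter last one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified zone descriptor. */
struct zone {
	unsigned long long start;	/* first sector of the zone */
	unsigned long long len;		/* zone length in sectors */
};

/* Callback invoked once per reported zone; non-zero aborts the walk. */
typedef int (*report_zones_cb)(const struct zone *zone, unsigned int idx,
			       void *data);

/*
 * Report up to nr_zones zones by handing each one to cb as it is
 * produced, so no array of zones is ever allocated. Returns the number
 * of zones reported, or the first non-zero value returned by cb.
 */
int report_zones(unsigned long long capacity, unsigned long long zone_len,
		 unsigned int nr_zones, report_zones_cb cb, void *data)
{
	unsigned long long start;
	unsigned int idx = 0;

	assert(cb != NULL);
	if (zone_len == 0)
		return -1;

	for (start = 0; start < capacity && idx < nr_zones;
	     start += zone_len, idx++) {
		struct zone z = { .start = start, .len = zone_len };
		int ret;

		/* The last zone may be shorter than the others. */
		if (capacity - start < zone_len)
			z.len = capacity - start;

		ret = cb(&z, idx, data);
		if (ret)
			return ret;
	}
	return (int)idx;
}

/* Example callback: accumulate the total length of all reported zones. */
int sum_zone_len(const struct zone *zone, unsigned int idx, void *data)
{
	(void)idx;
	*(unsigned long long *)data += zone->len;
	return 0;
}
```

The caller's private state travels through the `data` pointer, which is what lets the reporting side stay free of large allocations.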
#
8adeac3b |
|
02-Oct-2019 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
dm stripe: use struct_size() in kmalloc() One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct stripe_c { ... struct stripe stripe[0]; }; In this case alloc_context() and dm_array_too_big() are removed and replaced by the direct use of the struct_size() helper in kmalloc(). Notice that open-coded form is prone to type mistakes. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
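The allocation-size pattern from the entry above can be approximated in userspace like this. It is a sketch under stated assumptions: `stripe_c_size()` and `stripe_c_alloc()` are hypothetical helpers that mimic what struct_size() provides (header size plus n trailing elements, saturating on overflow), the structs are cut-down stand-ins, and malloc() replaces kmalloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Cut-down, hypothetical stand-ins for the dm-stripe structures. */
struct stripe {
	unsigned long long physical_start;
};

struct stripe_c {
	unsigned int stripes;
	struct stripe stripe[];	/* flexible array member */
};

/*
 * Size of a struct stripe_c followed by n struct stripe elements.
 * Saturates to SIZE_MAX when the arithmetic would overflow, so the
 * caller can reject the request instead of under-allocating.
 */
size_t stripe_c_size(size_t n)
{
	if (n > (SIZE_MAX - sizeof(struct stripe_c)) / sizeof(struct stripe))
		return SIZE_MAX;
	return sizeof(struct stripe_c) + n * sizeof(struct stripe);
}

/* Allocate a context for n stripes; returns NULL on overflow or OOM. */
struct stripe_c *stripe_c_alloc(unsigned int n)
{
	size_t bytes = stripe_c_size(n);
	struct stripe_c *sc;

	if (bytes == SIZE_MAX)
		return NULL;

	sc = malloc(bytes);
	if (sc)
		sc->stripes = n;
	return sc;
}
```

Keeping the size computation in one checked helper is what removes the type and overflow mistakes that the open-coded form invites.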
#
733232f8 |
|
16-Jul-2019 |
Mike Snitzer <snitzer@redhat.com> |
dm: use printk ratelimiting functions DM provided its own ratelimiting printk wrapper but given printk advances this is no longer needed. Also, switching DMDEBUG_LIMIT to using pr_debug_ratelimited() fixes the reported issue where DMDEBUG_LIMIT() still caused a flood of "callbacks suppressed" messages. Reported-by: Milan Broz <gmazyland@gmail.com> Depends-on: 29fc2bc7539386 ("printk: pr_debug_ratelimited: check state first to reduce "callbacks suppressed" messages") Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
bd976e52 |
|
30-Jun-2019 |
Damien Le Moal <damien.lemoal@wdc.com> |
block: Kill gfp_t argument of blkdev_report_zones() Only GFP_KERNEL and GFP_NOIO are used with blkdev_report_zones(). In preparation of using vmalloc() for large report buffer and zone array allocations used by this function, remove its "gfp_t gfp_mask" argument and rely on the caller context to use memalloc_noio_save/restore() where necessary (block layer zone revalidation and dm-zoned I/O error path). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
5de719e3 |
|
24-Apr-2019 |
Yufen Yu <yuyufen@huawei.com> |
dm mpath: fix missing call of path selector type->end_io After commit 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback"), map_request() will requeue the tio when issued clone request return BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. Thus, if device driver status is error, a tio may be requeued multiple times until the return value is not DM_MAPIO_REQUEUE. That means type->start_io may be called multiple times, while type->end_io is only called when IO complete. In fact, even without commit 396eaf21ee17, setup_clone() failure can also cause tio requeue and associated missed call to type->end_io. The service-time path selector selects path based on in_flight_size, which is increased by st_start_io() and decreased by st_end_io(). Missed calls to st_end_io() can lead to in_flight_size count error and will cause the selector to make the wrong choice. In addition, queue-length path selector will also be affected. To fix the problem, call type->end_io in ->release_clone_rq before tio requeue. map_info is passed to ->release_clone_rq() for map_request() error path that result in requeue. Fixes: 396eaf21ee17 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback") Cc: stable@vger.kernl.org Signed-off-by: Yufen Yu <yuyufen@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
6bbc923d |
|
21-Feb-2019 |
Helen Koike <helen.koike@collabora.com> |
dm: add support to directly boot to a mapped device Add a "create" module parameter, which allows device-mapper targets to be configured at boot time. This enables early use of DM targets in the boot process (as the root device or otherwise) without the need of an initramfs. The syntax used in the boot param is based on the concise format from the dmsetup tool to follow the rule of least surprise: dmsetup table --concise /dev/mapper/lroot Which is: dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+] Where, <name> ::= The device name. <uuid> ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | "" <minor> ::= The device minor number | "" <flags> ::= "ro" | "rw" <table> ::= <start_sector> <num_sectors> <target_type> <target_args> <target_type> ::= "verity" | "linear" | ... For example, the following could be added in the boot parameters: dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0 Only targets that were tested, and that don't change any block device when the device is created as read-only, are allowed. For example, mirror and cache targets are not allowed. The rationale behind this is that if the user makes a mistake, choosing the wrong device to be the mirror or the cache can corrupt data. The only targets initially allowed are: * crypt * delay * linear * snapshot-origin * striped * verity Co-developed-by: Will Drewry <wad@chromium.org> Co-developed-by: Kees Cook <keescook@chromium.org> Co-developed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
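To make the dm-mod.create field layout from the entry above concrete, here is a minimal userspace sketch that splits a single device specification into its name, uuid, minor, flags and table fields. It is not the kernel's parser: `struct dm_create_spec`, `next_field()`, `parse_dm_create()` and the `MAX_TABLES` limit are hypothetical, multiple devices separated by ';' are not handled, and escaped commas inside table arguments are assumed not to occur.

```c
#include <string.h>

#define MAX_TABLES 8 /* arbitrary limit for this sketch */

struct dm_create_spec {
    char *name;
    char *uuid;   /* may be the empty string */
    char *minor;  /* may be the empty string */
    char *flags;  /* "ro" or "rw" */
    char *tables[MAX_TABLES];
    int nr_tables;
};

/*
 * Cut the next comma-separated field out of *cursor in place and skip its
 * leading spaces. Returns NULL once the input is exhausted.
 */
static char *next_field(char **cursor)
{
    char *field = *cursor;
    char *comma;

    if (!field)
        return NULL;
    comma = strchr(field, ',');
    if (comma) {
        *comma = '\0';
        *cursor = comma + 1;
    } else {
        *cursor = NULL;
    }
    while (*field == ' ')
        field++;
    return field;
}

/* Parse "<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]"; spec is modified. */
static int parse_dm_create(char *spec, struct dm_create_spec *out)
{
    char *cursor = spec;
    char *table;

    memset(out, 0, sizeof(*out));
    out->name = next_field(&cursor);
    out->uuid = next_field(&cursor);
    out->minor = next_field(&cursor);
    out->flags = next_field(&cursor);
    if (!out->flags)
        return -1;
    if (strcmp(out->flags, "ro") != 0 && strcmp(out->flags, "rw") != 0)
        return -1;

    while ((table = next_field(&cursor)) != NULL) {
        if (out->nr_tables == MAX_TABLES)
            return -1;
        out->tables[out->nr_tables++] = table;
    }
    return out->nr_tables > 0 ? 0 : -1;
}
```

Fed the example string from the commit message, the sketch yields the name "lroot", empty uuid and minor fields, the flags "rw", and two linear table lines.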
#
0bdb50c5 |
|
06-Jan-2019 |
NeilBrown <neil@brown.name> |
dm: fix to_sector() for 32bit A dm-raid array with devices larger than 4GB won't assemble on a 32 bit host since _check_data_dev_sectors() was added in 4.16. This is because to_sector() treats its argument as an "unsigned long" which is 32bits (4GB) on a 32bit host. Using "unsigned long long" is more correct. Kernels as early as 4.2 can have other problems due to to_sector() being used on the size of a device. Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality") cc: stable@vger.kernel.org (v4.2+) Reported-and-tested-by: Guillaume Perréal <gperreal@free.fr> Signed-off-by: NeilBrown <neil@brown.name> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
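The truncation behind the to_sector() entry above can be reproduced in userspace. The sketch below models a 32-bit "unsigned long" with `uint32_t`; the function names are invented for the example and the bodies are an assumption about the shape of the helper (a right shift by SECTOR_SHIFT), not a copy of the kernel source.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* 512-byte sectors */

/* Models the old helper on a 32-bit host, where "unsigned long" is 32 bits wide. */
static uint32_t to_sector_ulong32(uint32_t bytes)
{
    return bytes >> SECTOR_SHIFT;
}

/* Models the fixed helper: the argument stays 64 bits wide. */
static uint64_t to_sector_ull(uint64_t bytes)
{
    return bytes >> SECTOR_SHIFT;
}

/* What a 64-bit byte count turns into when passed through the 32-bit parameter. */
static uint32_t sectors_seen_by_32bit_host(uint64_t bytes)
{
    return to_sector_ulong32((uint32_t)bytes); /* bits 32 and up are lost here */
}
```

For a 5 GiB device the 64-bit version gives 10485760 sectors, while the 32-bit model sees only the low 1 GiB and reports 2097152.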
#
61697a6a |
|
18-Jan-2019 |
Mike Snitzer <snitzer@redhat.com> |
dm: eliminate 'split_discard_bios' flag from DM target interface There is no need to have DM core split discards on behalf of a DM target now that blk_queue_split() handles splitting discards based on the queue_limits. A DM target just needs to set max_discard_sectors, discard_granularity, etc, in queue_limits. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
e76239a3 |
|
12-Oct-2018 |
Christoph Hellwig <hch@lst.de> |
block: add a report_zones method Dispatching a report zones command through the request queue is a major pain due to the command reply payload rewriting necessary. Given that blkdev_report_zones() is executing everything synchronously, implement report zones as a block device file operation instead, allowing major simplification of the code in many places. sd, null-blk, dm-linear and dm-flakey being the only block device drivers supporting exposing zoned block devices, these drivers are modified to provide the device side implementation of the report_zones() block device file operation. For device mappers, a new report_zones() target type operation is defined so that the upper block layer calls blkdev_report_zones() can be propagated down to the underlying devices of the dm targets. Implementation for this new operation is added to the dm-linear and dm-flakey targets. Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de> [Damien] * Changed method block_device argument to gendisk * Various bug fixes and improvements * Added support for null_blk, dm-linear and dm-flakey. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
f349b0a3 |
|
09-Oct-2018 |
Michał Mirosław <mirq-linux@rere.qmqm.pl> |
dm: add dm_table_device_name() Add a shortcut for dm_device_name(dm_table_get_md(t)). Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
953923c0 |
|
11-Oct-2018 |
Mike Snitzer <snitzer@redhat.com> |
dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED Now that request-based DM is only using blk-mq, there is no need to differentiate between legacy "rq" and new "mq". We're back to a single request-based DM -- and there was much rejoicing! Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
b3a9a0c3 |
|
02-May-2018 |
Dan Williams <dan.j.williams@intel.com> |
dax: Introduce a ->copy_to_iter dax operation Similar to the ->copy_from_iter() operation, a platform may want to deploy an architecture or device specific routine for handling reads from a dax_device like /dev/pmemX. On x86 this routine will point to a machine check safe version of copy_to_iter(). For now, add the plumbing to device-mapper and the dax core. Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
5bd5e8d8 |
|
03-Apr-2018 |
Mike Snitzer <snitzer@redhat.com> |
dm: remove fmode_t argument from .prepare_ioctl hook Use the fmode_t that is passed to dm_blk_ioctl() rather than inconsistently (varies across targets) drop it on the floor by overriding it with the fmode_t stored in 'struct dm_dev'. All the persistent reservation functions weren't using the fmode_t they got back from .prepare_ioctl so remove them. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
00716545 |
|
13-Mar-2018 |
Denis Semakin <d.semakin@omprussia.ru> |
dm: add support for secure erase forwarding Set QUEUE_FLAG_SECERASE in DM device's queue_flags if a DM table's data devices support secure erase. Also, add support for secure erase to both the linear and striped targets. Signed-off-by: Denis Semakin <d.semakin@omprussia.ru> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
1eb5fa84 |
|
28-Feb-2018 |
Mike Snitzer <snitzer@redhat.com> |
dm: allow targets to return output from messages they are sent Could be useful for a target to return stats or other information. If a target does DMEMIT() anything to @result from its .message method then it must return 1 to the caller. Signed-off-By: Mike Snitzer <snitzer@redhat.com>
|
#
233bde21 |
|
14-Mar-2018 |
Bart Van Assche <bvanassche@acm.org> |
block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> It happens often while I'm preparing a patch for a block driver that I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT available for this driver? Do I have to introduce definitions of these constants before I can use these constants? To avoid this confusion, move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the <linux/blkdev.h> header file such that these become available for all block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h header file conditional to avoid that including that header file after <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE redefinition. Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have not been removed from uapi header files nor from NAND drivers in which these constants are used for another purpose than converting block layer offsets and sizes into a number of sectors. Cc: David S. Miller <davem@davemloft.net> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
ac514ffc |
|
12-Jan-2018 |
Mike Snitzer <snitzer@redhat.com> |
dm mpath: delay the retry of a request if the target responded as busy Add DM_ENDIO_DELAY_REQUEUE to allow request-based multipath's multipath_end_io() to instruct dm-rq.c:dm_done() to delay a requeue. This is beneficial to do if BLK_STS_RESOURCE is returned from the target (because target is busy). Relative to blk-mq: kick the hw queues via blk_mq_requeue_work(), indirectly from dm-rq.c:__dm_mq_kick_requeue_list(), after a delay. For old .request_fn: use blk_delay_queue(). bio-based multipath doesn't have feature parity with request-based for retryable error requeues; that is something that'll need fixing in the future. Suggested-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Bart Van Assche <bart.vanassche@wdc.com> [as interpreted from Bart's "... patch looks fine to me."]
|
#
f6e7baad |
|
28-Mar-2017 |
Brian Norris <briannorris@chromium.org> |
dm: move dm_table_destroy() to same header as dm_table_create() If anyone is going to use dm_table_create(), they probably should be able to use dm_table_destroy() too. Move the dm_table_destroy() definition outside the private header, near dm_table_create() Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
22c11858 |
|
04-Dec-2017 |
Mike Snitzer <snitzer@redhat.com> |
dm: introduce DM_TYPE_NVME_BIO_BASED If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED then all devices in the DM table do not support partial completions. Also, the table has a single immutable target that doesn't require DM core to split bios. This will enable adding NVMe optimizations to bio-based DM. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
64f52b0e |
|
11-Dec-2017 |
Mike Snitzer <snitzer@redhat.com> |
dm: improve performance by moving dm_io structure to per-bio-data Eliminates need for a separate mempool to allocate 'struct dm_io' objects from. As such, it saves an extra mempool allocation for each original bio that DM core is issued. This complicates the per-bio-data accessor functions by needing to conditionally add extra padding to get to a target's per-bio-data. But in the end this provides a decent performance improvement for all bio-based DM devices. On an NVMe-loop based testbed to a ramdisk (~3100 MB/s): bio-based DM linear performance improved by 2% (went from 2665 to 2777 MB/s). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
f31c21e4 |
|
21-Nov-2017 |
NeilBrown <neilb@suse.com> |
dm: remove unused 'num_write_bios' target interface No DM target provides num_write_bios and none has since dm-cache's brief use in 2013. Having the possibility of num_write_bios > 1 complicates bio allocation. So remove the interface and assume there is only one bio needed. If a target ever needs more, it must provide a suitable bioset and allocate itself based on its particular needs. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
c3ca015f |
|
31-Aug-2017 |
Mikulas Patocka <mpatocka@redhat.com> |
dax: remove the pmem_dax_ops->flush abstraction Commit abebfbe2f731 ("dm: add ->flush() dax operation support") is buggy. A DM device may be composed of multiple underlying devices and all of them need to be flushed. That commit just routes the flush request to the first device and ignores the other devices. It could be fixed by adding more complex logic to the device mapper. But there is only one implementation of the method pmem_dax_ops->flush - that is pmem_dax_flush() - and it calls arch_wb_cache_pmem(). Consequently, we don't need the pmem_dax_ops->flush abstraction at all, we can call arch_wb_cache_pmem() directly from dax_flush() because dax_dev->ops->flush can't ever reach anything different from arch_wb_cache_pmem(). It should be also pointed out that for some uses of persistent memory it is needed to flush only a very small amount of data (such as 1 cacheline), and it would be overkill if we go through that device mapper machinery for a single flushed cache line. Fix this by removing the pmem_dax_ops->flush abstraction and call arch_wb_cache_pmem() directly from dax_flush(). Also, remove the device mapper code that forwards the flushes. Fixes: abebfbe2f731 ("dm: add ->flush() dax operation support") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
5916a22b |
|
22-Jun-2017 |
Eric Biggers <ebiggers@google.com> |
dm: constify argument arrays The arrays of 'struct dm_arg' are never modified by the device-mapper core, so constify them so that they are placed in .rodata. (Exception: the args array in dm-raid cannot be constified because it is allocated on the stack and modified.) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
60440789 |
|
09-Aug-2017 |
Bart Van Assche <bvanassche@acm.org> |
dm: fix printk() rate limiting code Using the same rate limiting state for different kinds of messages is wrong because this can cause a high frequency message to suppress a report of a low frequency message. Hence use a unique rate limiting state per message type. Fixes: 71a16736a15e ("dm: use local printk ratelimit") Cc: stable@vger.kernel.org Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
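The problem described in the rate limiting entry above can be shown with a deliberately simplified model. The sketch assumes a single time window and a plain burst counter; `struct ratelimit_state_sketch`, `ratelimit_allow()` and `rare_message_allowed()` are hypothetical names, not the kernel's ratelimit API.

```c
/* Simplified limiter: at most 'burst' messages per (single, implicit) window. */
struct ratelimit_state_sketch {
    int burst;
    int emitted;
};

/* Returns 1 if a message may be printed, 0 if it must be suppressed. */
static int ratelimit_allow(struct ratelimit_state_sketch *rs)
{
    if (rs->emitted >= rs->burst)
        return 0;
    rs->emitted++;
    return 1;
}

/*
 * Emit n copies of a noisy message against 'noisy', then one rare message
 * against 'rare'. Returns whether the rare message got through.
 * Passing the same state for both models the shared-state bug.
 */
static int rare_message_allowed(struct ratelimit_state_sketch *noisy,
                                struct ratelimit_state_sketch *rare, int n)
{
    int i;

    for (i = 0; i < n; i++)
        (void)ratelimit_allow(noisy);
    return ratelimit_allow(rare);
}
```

With one shared state the noisy message uses up the burst and the rare message is dropped; with one state per message type the rare message is still printed.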
#
10999307 |
|
08-May-2017 |
Damien Le Moal <damien.lemoal@wdc.com> |
dm: introduce dm_remap_zone_report() A target driver that supports zoned block devices and exposes them as such may receive a REQ_OP_ZONE_REPORT request so that the user can determine the mapped device's zone configuration. To process such a request properly, the target driver may need to remap the zone descriptors provided in the report reply. The helper function dm_remap_zone_report() does this generically using only the target start offset and length and the start offset within the target device. dm_remap_zone_report() will remap the start sector of all zones reported. If the report includes sequential zones, the write pointer position of these zones will also be remapped. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
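The remapping described in the dm_remap_zone_report() entry above amounts to shifting sector values from the underlying device's address space into the mapped device's. The sketch below shows that arithmetic for one zone under stated assumptions: `struct zone_sketch`, `enum zone_type_sketch` and `remap_zone()` are hypothetical, the zone is assumed to lie at or after `dev_start`, and the filtering of zones that fall outside the target's length is left out.

```c
#include <stdint.h>

enum zone_type_sketch { ZONE_CONVENTIONAL, ZONE_SEQUENTIAL };

/* Hypothetical, simplified zone descriptor in 512-byte sectors. */
struct zone_sketch {
    uint64_t start; /* first sector of the zone */
    uint64_t len;   /* zone length */
    uint64_t wp;    /* write pointer, meaningful for sequential zones only */
    enum zone_type_sketch type;
};

/*
 * Translate one reported zone from device sectors to mapped-device sectors.
 * target_begin: sector at which the target starts in the mapped device.
 * dev_start:    sector at which the target's area starts on the underlying device.
 */
static void remap_zone(struct zone_sketch *z, uint64_t target_begin,
                       uint64_t dev_start)
{
    z->start = z->start - dev_start + target_begin;
    if (z->type == ZONE_SEQUENTIAL)
        z->wp = z->wp - dev_start + target_begin;
}
```

Conventional zones have no write pointer to track, which is why only the start sector is shifted for them.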
#
dd88d313 |
|
08-May-2017 |
Damien Le Moal <damien.lemoal@wdc.com> |
dm table: add zoned block devices validation 1) Introduce DM_TARGET_ZONED_HM feature flag: The target drivers currently available will not operate correctly if a table target maps onto a host-managed zoned block device. To avoid problems, introduce the new feature flag DM_TARGET_ZONED_HM to allow a target to explicitly state that it supports host-managed zoned block devices. This feature is checked for all targets in a table if any of the table's block devices are host-managed. Note that as host-aware zoned block devices are backward compatible with regular block devices, they can be used by any of the current target types. This new feature is thus restricted to host-managed zoned block devices. 2) Check device area zone alignment: If a target maps to a zoned block device, check that the device area is aligned on zone boundaries to avoid problems with REQ_OP_ZONE_RESET operations (resetting a partially mapped sequential zone would not be possible). This also facilitates the processing of zone reports with REQ_OP_ZONE_REPORT bios. 3) Check block devices zone model compatibility: When setting the DM device's queue limits, several possibilities exist for zoned block devices: 1) The DM target driver may want to expose a different zone model (e.g. host-managed device emulation or regular block device on top of host-managed zoned block devices) 2) Expose the underlying zone model of the devices as-is. To allow both cases, the underlying block device zone model must be set in the target limits in dm_set_device_limits() and the compatibility of all devices checked similarly to the logical block size alignment. For this last check, introduce validate_hardware_zoned_model() to check that all targets of a table have the same zone model and that the zone sizes of the target devices are equal.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> [Mike Snitzer refactored Damien's original work to simplify the code] Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
d2c3c8dc |
|
20-Apr-2017 |
Joe Perches <joe@perches.com> |
dm: convert DM printk macros to pr_<level> macros Using pr_<level> is the more common logging style. Standardize style and use new macro DM_FMT. Use no_printk in DMDEBUG macros when CONFIG_DM_DEBUG is not #defined. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
abebfbe2 |
|
29-May-2017 |
Dan Williams <dan.j.williams@intel.com> |
dm: add ->flush() dax operation support Allow device-mapper to route flush operations to the per-target implementation. In order for the device stacking to work we need a dax_dev and a pgoff relative to that device. This gives each layer of the stack the information it needs to look up the operation pointer for the next level. This conceptually allows for an array of mixed device drivers with varying flush implementations. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
7e026c8c |
|
29-May-2017 |
Dan Williams <dan.j.williams@intel.com> |
dm: add ->copy_from_iter() dax operation support Allow device-mapper to route copy_from_iter operations to the per-target implementation. In order for the device stacking to work we need a dax_dev and a pgoff relative to that device. This gives each layer of the stack the information it needs to look up the operation pointer for the next level. This conceptually allows for an array of mixed device drivers with varying copy_from_iter implementations. Reviewed-by: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
4e4cbee9 |
|
03-Jun-2017 |
Christoph Hellwig <hch@lst.de> |
block: switch bios to blk_status_t Replace bi_error with a new bi_status to allow for a clear conversion. Note that device mapper overloaded bi_error with a private value, which we'll have to keep around at least for now and thus propagate to a proper blk_status_t value. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
2a842aca |
|
03-Jun-2017 |
Christoph Hellwig <hch@lst.de> |
block: introduce new block status code type Currently we use normal Linux errno values in the block layer, and while we accept any error a few have overloaded magic meanings. This patch instead introduces a new blk_status_t value that holds block layer specific status codes and explicitly explains their meaning. Helpers to convert from and to the previous special meanings are provided for now, but I suspect we want to get rid of them in the long run - those drivers that have an errno input (e.g. networking) usually get errnos that don't know about the special block layer overloads, and similarly returning them to userspace will usually return something that strictly speaking isn't correct for file system operations, but that's left as an exercise for later. For now the set of errors is a very limited set that closely corresponds to the previous overloaded errno values, but there is some low hanging fruit to improve it. blk_status_t (ab)uses the sparse __bitwise annotations to allow for sparse typechecking, so that we can easily catch places passing the wrong values. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
1be56909 |
|
03-Jun-2017 |
Christoph Hellwig <hch@lst.de> |
dm: change ->end_io calling convention Turn the error parameter into a pointer so that target drivers can change the value, and make sure only DM_ENDIO_* values are returned from the methods. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
412445ac |
|
26-Apr-2017 |
Christoph Hellwig <hch@lst.de> |
dm: introduce a new DM_MAPIO_KILL return value This untangles the DM_MAPIO_* values returned from ->clone_and_map_rq from the error codes used by the block layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
7ed8578a |
|
26-Apr-2017 |
Christoph Hellwig <hch@lst.de> |
dm rq: change ->rq_end_io calling conventions Instead of returning either a DM_ENDIO_* constant or an error code, add a new DM_ENDIO_DONE value that means keep errno as is. This allows us to easily keep the existing error code in cases where we can't push back, and it also prepares for the new block level status codes with strict type checking. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
7e0d574f |
|
27-Apr-2017 |
Bart Van Assche <bvanassche@acm.org> |
dm: introduce enum dm_queue_mode to cleanup related code Introduce an enumeration type for the queue mode. This patch does not change any functionality but makes the DM code easier to read. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
817bf402 |
|
12-Apr-2017 |
Dan Williams <dan.j.williams@intel.com> |
dm: teach dm-targets to use a dax_device + dax_operations Arrange for dm to lookup the dax services available from member devices. Update the dax-capable targets, linear and stripe, to route dax operations to the underlying device. Changes the target-internal ->direct_access() method to more closely align with the dax_operations ->direct_access() calling convention. Cc: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
e2460f2a |
|
18-Apr-2017 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: mark targets that pass integrity data A dm-crypt on dm-integrity device incorrectly advertises an integrity profile on the DM crypt device. It can be seen in the files "/sys/block/dm-*/integrity/*" that both dm-integrity and dm-crypt target advertise the integrity profile. That is incorrect, only the dm-integrity target should advertise the integrity profile. A general problem in DM is that if we have a DM device that depends on another device with an integrity profile, the upper device will always advertise the integrity profile, even when the target driver doesn't support handling integrity data. Most targets don't support integrity data, so we provide a whitelist of targets that support it (linear, delay and striped). The targets that support passing integrity data to the lower device are marked with the flag DM_TARGET_PASSES_INTEGRITY. The DM core will now advertise integrity data on a DM device only if all the targets support the integrity data. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
f26c5719 |
|
12-Apr-2017 |
Dan Williams <dan.j.williams@intel.com> |
dm: add dax_device and dax_operations support Allocate a dax_device to represent the capacity of a device-mapper instance. Provide a ->direct_access() method via the new dax_operations indirection that mirrors the functionality of the current direct_access support via block_device_operations. Once fs/dax.c has been converted to use dax_operations the old dm_blk_direct_access() will be removed. A new helper dm_dax_get_live_target() is introduced to separate some of the dm-specifics from the direct_access implementation. This enabling is only for the top-level dm representation to upper layers. Converting target direct_access implementations is deferred to a separate patch. Cc: Toshi Kani <toshi.kani@hpe.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
#
48920ff2 |
|
05-Apr-2017 |
Christoph Hellwig <hch@lst.de> |
block: remove the discard_zeroes_data flag Now that we use the proper REQ_OP_WRITE_ZEROES operation everywhere we can kill this hack. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
ac62d620 |
|
05-Apr-2017 |
Christoph Hellwig <hch@lst.de> |
dm: support REQ_OP_WRITE_ZEROES Copy & paste from the REQ_OP_WRITE_SAME code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
9b4b5a79 |
|
04-Jan-2017 |
Milan Broz <gmazyland@gmail.com> |
dm table: add flag to allow target to handle its own integrity metadata Add DM_TARGET_INTEGRITY flag that specifies bio integrity metadata is not inherited but implemented in the target itself. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
eb8db831 |
|
22-Jan-2017 |
Christoph Hellwig <hch@lst.de> |
dm: always defer request allocation to the owner of the request_queue DM already calls blk_mq_alloc_request on the request_queue of the underlying device if it is a blk-mq device. But now that we allow drivers to allocate additional data and initialize it ahead of time we need to do the same for all drivers. Doing so and using the new cmd_size infrastructure in the block layer greatly simplifies the dm-rq and mpath code, and should also make arbitrary combinations of SQ and MQ devices with SQ or MQ device mapper tables easily possible as a further step. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
a8ac51e4 |
|
09-Sep-2016 |
Mike Snitzer <snitzer@redhat.com> |
dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests Otherwise blk-mq will immediately dispatch requests that are requeued via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() with a delay of 5 secs. In the context of DM multipath (all paths down) it doesn't make any sense to requeue more quickly. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
545ed20e |
|
22-Jun-2016 |
Toshi Kani <toshi.kani@hpe.com> |
dm: add infrastructure for DAX support Change mapped device to implement direct_access function, dm_blk_direct_access(), which calls a target direct_access function. 'struct target_type' is extended to have target direct_access interface. This function limits direct accessible size to the dm_target's limit with max_io_len(). Add dm_table_supports_dax() to iterate all targets and associated block devices to check for DAX support. To add DAX support to a DM target the target must only implement the direct_access function. Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped device supports DAX and is bio based. This new type is used to assure that all target devices have DAX support and remain that way after QUEUE_FLAG_DAX is set in mapped device. At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting DM_TYPE_DAX_BIO_BASED to the type. Any subsequent table load to the mapped device must have the same type, or else it fails per the check in table_load(). Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
e83068a5 |
|
24-May-2016 |
Mike Snitzer <snitzer@redhat.com> |
dm mpath: add optional "queue_mode" feature Allow a user to specify an optional feature 'queue_mode <mode>' where <mode> may be "bio", "rq" or "mq" -- which corresponds to bio-based, request_fn rq-based, and blk-mq rq-based respectively. If the queue_mode feature isn't specified the default for the "multipath" target is still "rq" but if dm_mod.use_blk_mq is set to Y it'll default to mode "mq". This new queue_mode feature introduces the ability for each multipath device to have its own queue_mode (whereas before this feature all multipath devices effectively had to have the same queue_mode). This commit also goes a long way to eliminate the awkward (ab)use of DM_TYPE_*, the associated filter_md_type() and other relatively fragile and difficult to maintain code. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
4df2bf46 |
|
01-Feb-2016 |
DingXiang <dingxiang@huawei.com> |
dm snapshot: disallow the COW and origin devices from being identical Otherwise loading a "snapshot" table using the same device for the origin and COW devices, e.g.: echo "0 20971520 snapshot 253:3 253:3 P 8" | dmsetup create snap will trigger: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 [ 1958.979934] IP: [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1958.989655] PGD 0 [ 1958.991903] Oops: 0000 [#1] SMP ... [ 1959.059647] CPU: 9 PID: 3556 Comm: dmsetup Tainted: G IO 4.5.0-rc5.snitm+ #150 ... [ 1959.083517] task: ffff8800b9660c80 ti: ffff88032a954000 task.ti: ffff88032a954000 [ 1959.091865] RIP: 0010:[<ffffffffa040efba>] [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.104295] RSP: 0018:ffff88032a957b30 EFLAGS: 00010246 [ 1959.110219] RAX: 0000000000000000 RBX: 0000000000000008 RCX: 0000000000000001 [ 1959.118180] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff880329334a00 [ 1959.126141] RBP: ffff88032a957b50 R08: 0000000000000000 R09: 0000000000000001 [ 1959.134102] R10: 000000000000000a R11: f000000000000000 R12: ffff880330884d80 [ 1959.142061] R13: 0000000000000008 R14: ffffc90001c13088 R15: ffff880330884d80 [ 1959.150021] FS: 00007f8926ba3840(0000) GS:ffff880333440000(0000) knlGS:0000000000000000 [ 1959.159047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1959.165456] CR2: 0000000000000098 CR3: 000000032f48b000 CR4: 00000000000006e0 [ 1959.173415] Stack: [ 1959.175656] ffffc90001c13040 ffff880329334a00 ffff880330884ed0 ffff88032a957bdc [ 1959.183946] ffff88032a957bb8 ffffffffa040f225 ffff880329334a30 ffff880300000000 [ 1959.192233] ffffffffa04133e0 ffff880329334b30 0000000830884d58 00000000569c58cf [ 1959.200521] Call Trace: [ 1959.203248] [<ffffffffa040f225>] dm_exception_store_create+0x1d5/0x240 [dm_snapshot] [ 1959.211986] [<ffffffffa040d310>] snapshot_ctr+0x140/0x630 [dm_snapshot] [ 1959.219469] [<ffffffffa0005c44>] ? 
dm_split_args+0x64/0x150 [dm_mod] [ 1959.226656] [<ffffffffa0005ea7>] dm_table_add_target+0x177/0x440 [dm_mod] [ 1959.234328] [<ffffffffa0009203>] table_load+0x143/0x370 [dm_mod] [ 1959.241129] [<ffffffffa00090c0>] ? retrieve_status+0x1b0/0x1b0 [dm_mod] [ 1959.248607] [<ffffffffa0009e35>] ctl_ioctl+0x255/0x4d0 [dm_mod] [ 1959.255307] [<ffffffff813304e2>] ? memzero_explicit+0x12/0x20 [ 1959.261816] [<ffffffffa000a0c3>] dm_ctl_ioctl+0x13/0x20 [dm_mod] [ 1959.268615] [<ffffffff81215eb6>] do_vfs_ioctl+0xa6/0x5c0 [ 1959.274637] [<ffffffff81120d2f>] ? __audit_syscall_entry+0xaf/0x100 [ 1959.281726] [<ffffffff81003176>] ? do_audit_syscall_entry+0x66/0x70 [ 1959.288814] [<ffffffff81216449>] SyS_ioctl+0x79/0x90 [ 1959.294450] [<ffffffff8167e4ae>] entry_SYSCALL_64_fastpath+0x12/0x71 ... [ 1959.323277] RIP [<ffffffffa040efba>] dm_exception_store_set_chunk_size+0x7a/0x110 [dm_snapshot] [ 1959.333090] RSP <ffff88032a957b30> [ 1959.336978] CR2: 0000000000000098 [ 1959.344121] ---[ end trace b049991ccad1169e ]--- Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1195899 Cc: stable@vger.kernel.org Signed-off-by: Ding Xiang <dingxiang@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
30187e1d |
|
31-Jan-2016 |
Mike Snitzer <snitzer@redhat.com> |
dm: rename target's per_bio_data_size to per_io_data_size Request-based DM will also make use of per_bio_data_size. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
f083b09b |
|
06-Feb-2016 |
Mike Snitzer <snitzer@redhat.com> |
dm: set DM_TARGET_WILDCARD feature on "error" target The DM_TARGET_WILDCARD feature indicates that the "error" target may replace any target; even immutable targets. This feature will be useful to preserve the ability to replace the "multipath" target even once it is formally converted over to having the DM_TARGET_IMMUTABLE feature. Also, implicit in the DM_TARGET_WILDCARD feature flag being set is that .map, .map_rq, .clone_and_map_rq and .release_clone_rq are all defined in the target_type. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
e56f81e0 |
|
15-Oct-2015 |
Christoph Hellwig <hch@lst.de> |
dm: refactor ioctl handling This moves the call to blkdev_ioctl and the argument checking to DM core code, and only leaves a callout to find the block device to operate on in the targets. This simplifies the code and allows us to pass through ioctl-like command using other methods in the next patch. Also split out a helper around calling the prepare_ioctl method that will be reused for persistent reservation handling. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
8ae12666 |
|
28-Apr-2015 |
Kent Overstreet <kent.overstreet@gmail.com> |
block: kill merge_bvec_fn() completely As generic_make_request() is now able to handle arbitrarily sized bios, it's no longer necessary for each individual block driver to define its own ->merge_bvec_fn() callback. Remove every invocation completely. Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: drbd-user@lists.linbit.com Cc: Jiri Kosina <jkosina@suse.cz> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@kernel.org> Cc: ceph-devel@vger.kernel.org Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Neil Brown <neilb@suse.de> Cc: linux-raid@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits) Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> [dpark: also remove ->merge_bvec_fn() in dm-thin as well as dm-era-target, and resolve merge conflicts] Signed-off-by: Dongsu Park <dpark@posteo.net> Signed-off-by: Ming Lin <ming.l@ssi.samsung.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
52b09914 |
|
23-Feb-2015 |
Mike Snitzer <snitzer@redhat.com> |
dm: remove unnecessary wrapper around blk_lld_busy There is no need for DM to export a wrapper around the already exported blk_lld_busy(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
09ee96b2 |
|
26-Feb-2015 |
Mikulas Patocka <mpatocka@redhat.com> |
dm snapshot: suspend merging snapshot when doing exception handover The "dm snapshot: suspend origin when doing exception handover" commit fixed an exception store handover bug associated with pending exceptions to the "snapshot-origin" target. However, a similar problem exists in snapshot merging. When snapshot merging is in progress, we use the target "snapshot-merge" instead of "snapshot-origin". Consequently, during exception store handover, we must find the snapshot-merge target and suspend its associated mapped_device. To avoid lockdep warnings, the target must be suspended and resumed without holding _origins_lock. Introduce a dm_hold() function that grabs a reference on a mapped_device, but unlike dm_get(), it doesn't crash if the device has the DMF_FREEING flag set; it returns an error in this case. In snapshot_resume() we grab the reference to the origin device using dm_hold() while holding _origins_lock (_origins_lock guarantees that the device won't disappear). Then we release _origins_lock, suspend the device and grab _origins_lock again. NOTE to stable@ people: When backporting to kernels 3.18 and older, use dm_internal_suspend and dm_internal_resume instead of dm_internal_suspend_fast and dm_internal_resume_fast. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
|
#
e5863d9a |
|
17-Dec-2014 |
Mike Snitzer <snitzer@redhat.com> |
dm: allocate requests in target when stacking on blk-mq devices For blk-mq request-based DM the responsibility of allocating a cloned request is transferred from DM core to the target type. Doing so enables the cloned request to be allocated from the appropriate blk-mq request_queue's pool (only the DM target, e.g. multipath, can know which block device to send a given cloned request to). Care was taken to preserve compatibility with old-style block request completion that requires request-based DM _not_ acquire the clone request's queue lock in the completion path. As such, there are now 2 different request-based DM target_type interfaces: 1) the original .map_rq() interface will continue to be used for non-blk-mq devices -- the preallocated clone request is passed in from DM core. 2) a new .clone_and_map_rq() and .release_clone_rq() will be used for blk-mq devices -- blk_get_request() and blk_put_request() are used respectively from these hooks. dm_table_set_type() was updated to detect if the request-based target is being stacked on blk-mq devices; if so, DM_TYPE_MQ_REQUEST_BASED is set. DM core disallows switching the DM table's type after it is set. This means that there is no mixing of non-blk-mq and blk-mq devices within the same request-based DM table. [This patch was started by Keith and later heavily modified by Mike] Tested-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
dbf9782c |
|
16-Dec-2014 |
Mike Snitzer <snitzer@redhat.com> |
dm: remove exports for request-based interfaces without external callers Remove exports for dm_dispatch_request, dm_requeue_unmapped_request, and dm_kill_unmapped_request. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
d67ee213 |
|
28-Oct-2014 |
Mike Snitzer <snitzer@redhat.com> |
dm: add presuspend_undo hook to target_type The DM thin-pool target now must undo the changes performed during pool_presuspend() so introduce presuspend_undo hook in target_type. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Joe Thornber <ejt@redhat.com>
|
#
11f0431b |
|
03-Jun-2014 |
Mike Snitzer <snitzer@redhat.com> |
dm: remove symbol export for dm_set_device_limits There is no need for code other than DM core to use dm_set_device_limits so remove its EXPORT_SYMBOL_GPL. Also, clean up a couple of whitespace nits. Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
1dd40c3e |
|
14-Mar-2014 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: introduce dm_accept_partial_bio The function dm_accept_partial_bio allows the target to specify how many sectors of the current bio it will process. If the target only wants to accept part of the bio, it calls dm_accept_partial_bio and the DM core sends the rest of the data in the next bio. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
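
The accept-then-resend behaviour described above can be modelled in userspace. This is only a sketch under stated assumptions: `count_bios` and its parameters are hypothetical names, not kernel API, and the model assumes the target accepts at most `accept_max` sectors of every bio it is handed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model (not kernel code): count how many bios the core ends up
 * sending when the target accepts at most `accept_max` sectors of each bio
 * and the remainder is resent in a following bio.
 */
unsigned count_bios(uint32_t total_sectors, uint32_t accept_max)
{
    unsigned bios = 0;

    assert(accept_max > 0);
    while (total_sectors > 0) {
        /* The target takes the front part of the bio... */
        uint32_t accepted = total_sectors < accept_max ? total_sectors
                                                       : accept_max;
        /* ...and the rest comes back to it as the next bio. */
        total_sectors -= accepted;
        bios++;
    }
    return bios;
}
```

With these assumptions, a 100-sector bio and a 64-sector limit results in two bios: one of 64 sectors and one of 36.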
|
#
9974fa2c |
|
28-Feb-2014 |
Mike Snitzer <snitzer@redhat.com> |
dm table: add dm_table_run_md_queue_async Introduce dm_table_run_md_queue_async() to run the request_queue of the mapped_device associated with a request-based DM table. Also add dm_md_get_queue() wrapper to extract the request_queue from a mapped_device. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
|
#
d70ab4fb |
|
03-Mar-2014 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: remove dm_get_mapinfo Remove dm_get_mapinfo() because no target uses it. Targets can allocate per-bio data using ti->per_bio_data_size, which is much more flexible than union map_info. Leave union map_info only for the request-based multipath target's use. Also delete the unused "unsigned long long ll" field of union map_info. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
|
#
f84cb8a4 |
|
18-Sep-2013 |
Mike Snitzer <snitzer@redhat.com> |
dm mpath: disable WRITE SAME if it fails Workaround the SCSI layer's problematic WRITE SAME heuristics by disabling WRITE SAME in the DM multipath device's queue_limits if an underlying device disabled it. The WRITE SAME heuristics, with both the original commit 5db44863b6eb ("[SCSI] sd: Implement support for WRITE SAME") and the updated commit 66c28f971 ("[SCSI] sd: Update WRITE SAME heuristics"), default to enabling WRITE SAME(10) even without successfully determining it is supported. After the first failed WRITE SAME the SCSI layer will disable WRITE SAME for the device (by setting sdkp->device->no_write_same which results in 'max_write_same_sectors' in device's queue_limits to be set to 0). When a device is stacked ontop of such a SCSI device any changes to that SCSI device's queue_limits do not automatically propagate up the stack. As such, a DM multipath device will not have its WRITE SAME support disabled. This causes the block layer to continue to issue WRITE SAME requests to the mpath device which causes paths to fail and (if mpath IO isn't configured to queue when no paths are available) it will result in actual IO errors to the upper layers. This fix doesn't help configurations that have additional devices stacked ontop of the mpath device (e.g. LVM created linear DM devices ontop). A proper fix that restacks all the queue_limits from the bottom of the device stack up will need to be explored if SCSI will continue to use this model of optimistically allowing op codes and then disabling them after they fail for the first time. Before this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: failing WRITE SAME IO with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. device-mapper: multipath: Failing path 8:112. 
end_request: I/O error, dev dm-6, sector 4616 dm-6: WRITE SAME failed. Manually zeroing. end_request: I/O error, dev dm-6, sector 4616 end_request: I/O error, dev dm-6, sector 5640 end_request: I/O error, dev dm-6, sector 6664 end_request: I/O error, dev dm-6, sector 7688 end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. end_request: I/O error, dev dm-6, sector 524296 Aborting journal on device dm-6-8. end_request: I/O error, dev dm-6, sector 524288 Buffer I/O error on device dm-6, logical block 65536 lost page write due to I/O error on dm-6 JBD2: Error -5 detected when updating journal superblock for dm-6-8. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 33553920 After this patch: EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null) device-mapper: multipath: XXX snitm debugging: got -EREMOTEIO (-121) device-mapper: multipath: XXX snitm debugging: WRITE SAME I/O failed with error=-121 end_request: critical target error, dev dm-6, sector 528 dm-6: WRITE SAME failed. Manually zeroing. # cat /sys/block/sdh/queue/write_same_max_bytes 0 # cat /sys/block/dm-6/queue/write_same_max_bytes 0 It should be noted that WRITE SAME support wasn't enabled in DM multipath until v3.10. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org # 3.10+
|
#
fd2ed4d2 |
|
16-Aug-2013 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: add statistics support Support the collection of I/O statistics on user-defined regions of a DM device. If no regions are defined no statistics are collected so there isn't any performance impact. Only bio-based DM devices are currently supported. Each user-defined region specifies a starting sector, length and step. Individual statistics will be collected for each step-sized area within the range specified. The I/O statistics counters for each step-sized area of a region are in the same format as /sys/block/*/stat or /proc/diskstats but extra counters (12 and 13) are provided: total time spent reading and writing in milliseconds. All these counters may be accessed by sending the @stats_print message to the appropriate DM device via dmsetup. The creation of DM statistics will allocate memory via kmalloc or fallback to using vmalloc space. At most, 1/4 of the overall system memory may be allocated by DM statistics. The admin can see how much memory is used by reading /sys/module/dm_mod/parameters/stats_current_allocated_bytes See Documentation/device-mapper/statistics.txt for more details. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
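
To make the region/step arithmetic concrete, here is a minimal userspace sketch. The function names are hypothetical, and the rounding rule (a final area shorter than `step` still counts as an area) is an assumption of this sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of step-sized areas covering a region of
 * `length` sectors. Assumes a trailing partial area still gets counters.
 */
uint64_t region_area_count(uint64_t length, uint64_t step)
{
    assert(step > 0);
    return length / step + (length % step != 0);
}

/*
 * Hypothetical helper: index of the area that `sector` falls into, or -1
 * when the sector lies outside [start, start + length).
 */
int64_t region_area_index(uint64_t start, uint64_t length, uint64_t step,
                          uint64_t sector)
{
    assert(step > 0);
    if (sector < start || sector - start >= length)
        return -1;
    return (int64_t)((sector - start) / step);
}
```

For a region starting at sector 2048 with length 1000 and step 256, this gives four areas, and sector 3047 (the last sector of the region) falls into area 3.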
|
#
83d5e5b0 |
|
10-Jul-2013 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: optimize use SRCU and RCU This patch removes "io_lock" and "map_lock" in struct mapped_device and "holders" in struct dm_table and replaces these mechanisms with sleepable-rcu. Previously, the code would call "dm_get_live_table" and "dm_table_put" to get and release table. Now, the code is changed to call "dm_get_live_table" and "dm_put_live_table". dm_get_live_table locks sleepable-rcu and dm_put_live_table unlocks it. dm_get_live_table_fast/dm_put_live_table_fast can be used instead of dm_get_live_table/dm_put_live_table. These *_fast functions use non-sleepable RCU, so the caller must not block between them. If the code changes active or inactive dm table, it must call dm_sync_table before destroying the old table. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
058ce5ca |
|
10-May-2013 |
Alasdair G Kergon <agk@redhat.com> |
dm: document iterate_devices Document iterate_devices in device-mapper.h. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
b0d8ed4d |
|
01-Mar-2013 |
Alasdair G Kergon <agk@redhat.com> |
dm: add target num_write_bios fn Add a num_write_bios function to struct target. If an instance of a target sets this, it will be queried before the target's mapping function is called on a write bio, and the response controls the number of copies of the write bio that the target will receive. This provides a convenient way for a target to send the same data to more than one device. The new cache target uses this in writethrough mode, to send the data both to the cache and the backing device. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
55a62eef |
|
01-Mar-2013 |
Alasdair G Kergon <agk@redhat.com> |
dm: rename request variables to bios Use 'bio' in the name of variables and functions that deal with bios rather than 'request' to avoid confusion with the normal block layer use of 'request'. No functional changes. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
fd7c092e |
|
01-Mar-2013 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: fix truncated status strings Avoid returning a truncated table or status string instead of setting the DM_BUFFER_FULL_FLAG when the last target of a table fills the buffer. When processing a table or status request, the function retrieve_status calls ti->type->status. If ti->type->status returns non-zero, retrieve_status assumes that the buffer overflowed and sets DM_BUFFER_FULL_FLAG. However, targets don't return non-zero values from their status method on overflow. Most targets always return zero. If a buffer overflow happens in a target that is not the last in the table, it gets noticed during the next iteration of the loop in retrieve_status; but if a buffer overflow happens in the last target, it goes unnoticed and erroneously truncated data is returned. In the current code, the targets behave in the following way: * dm-crypt returns -ENOMEM if there is not enough space to store the key, but it returns 0 on all other overflows. * dm-thin returns errors from the status method if a disk error happened. This is incorrect because retrieve_status doesn't check the error code; it assumes that all non-zero values mean buffer overflow. * all the other targets always return 0. This patch changes the ti->type->status function to return void (because most targets don't use the return code). Overflow is detected in retrieve_status: if the status method fills up the remaining space completely, it is assumed that buffer overflow happened. Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
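
The "filled the remaining space completely" rule can be illustrated with a small userspace sketch. Both function names are hypothetical stand-ins, not the kernel's code: `demo_status` plays the role of a void status method, and `status_filled_buffer` plays the role of the check done by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a status method: writes text, returns nothing. */
void demo_status(char *buf, size_t maxlen, const char *text)
{
    assert(maxlen > 0);
    snprintf(buf, maxlen, "%s", text); /* truncates silently when too long */
}

/*
 * Hypothetical caller-side check: if the string occupies every available
 * byte (maxlen - 1 characters plus the terminator), assume it overflowed.
 */
int status_filled_buffer(const char *buf, size_t maxlen)
{
    return strlen(buf) >= maxlen - 1;
}
```

The check is deliberately conservative: output that fits exactly is also reported as an overflow, so the caller retries with a larger buffer instead of risking truncated data.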
|
#
7de3ee57 |
|
21-Dec-2012 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: remove map_info This patch removes map_info from bio-based device mapper targets. map_info is still used for request-based targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
ddbd658f |
|
21-Dec-2012 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: move target request nr to dm_target_io This patch moves target_request_nr from map_info to dm_target_io and makes it accessible with dm_bio_get_target_request_nr. This patch is a preparation for the next patch that removes map_info. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
c0820cf5 |
|
21-Dec-2012 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: introduce per_bio_data Introduce a field per_bio_data_size in struct dm_target. Targets can set this field in the constructor. If a target sets this field to a non-zero value, "per_bio_data_size" bytes of auxiliary data are allocated for each bio submitted to the target. These data can be used for any purpose by the target and help us improve performance by removing some per-target mempools. Per-bio data is accessed with dm_per_bio_data. The argument data_size must be the same as the value per_bio_data_size in dm_target. If the target has a pointer to per_bio_data, it can get a pointer to the bio with dm_bio_from_per_bio_data() function (data_size must be the same as the value passed to dm_per_bio_data). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
d54eaa5a |
|
21-Dec-2012 |
Mike Snitzer <snitzer@redhat.com> |
dm: prepare to support WRITE SAME Allow targets to opt in to WRITE SAME support by setting 'num_write_same_requests' in the dm_target structure. A dm device will only advertise WRITE SAME support if all its targets and all its underlying devices support it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
1f4e0ff0 |
|
27-Jul-2012 |
Alasdair G Kergon <agk@redhat.com> |
dm thin: commit before gathering status Commit outstanding metadata before returning the status for a dm thin pool so that the numbers reported are as up-to-date as possible. The commit is not performed if the device is suspended or if the DM_NOFLUSH_FLAG is supplied by userspace and passed to the target through a new 'status_flags' parameter in the target's dm_status_fn. The userspace dmsetup tool will support the --noflush flag with the 'dmsetup status' and 'dmsetup wait' commands from version 1.02.76 onwards. Tested-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
0ac55489 |
|
27-Jul-2012 |
Alasdair G Kergon <agk@redhat.com> |
dm: use bool bitfields in struct dm_target Use boolean bit fields for flags in struct dm_target. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
0e9c24ed |
|
27-Jul-2012 |
Joe Thornber <ejt@redhat.com> |
dm: allow targets to request flushes regardless of underlying device support Allow targets to override the 'supports flush' calculation. Set 'flush_supported' if a target needs to receive flushes regardless of whether or not its underlying devices have support. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
7acf0277 |
|
27-Jul-2012 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: introduce split_discard_requests This patch introduces a new variable split_discard_requests. It can be set by targets so that discard requests are split on max_io_len boundaries. When split_discard_requests is not set, discard requests are only split on boundaries between targets, as was the case before this patch. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
542f9038 |
|
27-Jul-2012 |
Mike Snitzer <snitzer@redhat.com> |
dm: support non power of two target max_io_len Remove the restriction that limits a target's specified maximum incoming I/O size to be a power of 2. Rename this setting from 'split_io' to the less-ambiguous 'max_io_len'. Change it from sector_t to uint32_t, which is plenty big enough, and introduce a wrapper function dm_set_target_max_io_len() to set it. Use sector_div() to process it now that it is not necessarily a power of 2. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
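
A short userspace sketch of why the power-of-2 restriction mattered: a bit mask only finds the offset within a chunk when the chunk size is a power of 2, whereas a division remainder works for any size. The function names are hypothetical, and plain `%` stands in here for the kernel's sector_div().

```c
#include <assert.h>
#include <stdint.h>

/* Sectors left before the next boundary; valid only for power-of-2 sizes. */
uint64_t sectors_to_boundary_pow2(uint64_t offset, uint32_t max_io_len)
{
    assert(max_io_len > 0 && (max_io_len & (max_io_len - 1)) == 0);
    return max_io_len - (offset & (max_io_len - 1));
}

/* Same quantity for any non-zero size, using a remainder instead of a mask. */
uint64_t sectors_to_boundary(uint64_t offset, uint32_t max_io_len)
{
    assert(max_io_len > 0);
    return max_io_len - (offset % max_io_len);
}
```

Both functions agree for sizes such as 8, but only the second handles a size such as 6: at offset 10, two sectors remain before the boundary at sector 12.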
|
#
af7346eb |
|
27-Jul-2012 |
Joe Thornber <ejt@redhat.com> |
dm: remove unused flush target method Remove unused dm_flush_fn .flush target method from header. This was left-over from the FLUSH/FUA conversion and is no longer used. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
36a0456f |
|
31-Oct-2011 |
Alasdair G Kergon <agk@redhat.com> |
dm table: add immutable feature Introduce DM_TARGET_IMMUTABLE to indicate that the target type cannot be mixed with any other target type, and once loaded into a device, it cannot be replaced with a table containing a different type. The thin provisioning pool device will use this. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
cc6cbe14 |
|
31-Oct-2011 |
Alasdair G Kergon <agk@redhat.com> |
dm table: add always writeable feature Add a target feature flag DM_TARGET_ALWAYS_WRITEABLE to indicate that a target does not support read-only mode. The initial implementation of the thin provisioning target uses this. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
3791e2fc |
|
31-Oct-2011 |
Alasdair G Kergon <agk@redhat.com> |
dm table: add singleton feature Introduce the concept of a singleton table which contains exactly one target. If a target type sets the DM_TARGET_SINGLETON feature bit device-mapper will ensure that any table that includes that target contains no others. The thin provisioning pool target uses this. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
71a16736 |
|
31-Oct-2011 |
Namhyung Kim <namhyung@gmail.com> |
dm: use local printk ratelimit printk_ratelimit() shares global ratelimiting state with all other subsystems, so its usage is discouraged. Instead, define and use dm's local state. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
983c7db3 |
|
25-Sep-2011 |
Milan Broz <mbroz@redhat.com> |
dm crypt: always disable discard_zeroes_data If optional discard support in dm-crypt is enabled, discard requests bypass the crypt queue and blocks of the underlying device are discarded. For the read path, discarded blocks are handled the same as normal ciphertext blocks, thus decrypted. So if the underlying device announces that discarded regions return zeroes, dm-crypt must disable this flag because after decryption there is just random noise instead of zeroes. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
498f0103 |
|
01-Aug-2011 |
Mike Snitzer <snitzer@redhat.com> |
dm table: share target argument parsing functions Move multipath target argument parsing code into dm-table so other targets can share it. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
4c259327 |
|
28-May-2011 |
Mike Snitzer <snitzer@redhat.com> |
dm table: allow targets to support discards internally Permit a target to support discards regardless of whether or not all its underlying devices do. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
af1db72d |
|
18-Apr-2011 |
NeilBrown <neilb@suse.de> |
md/dm - remove remains of plug_fn callback. Now that unplugging is done differently, the unplug_fn callback is never called, so it can be completely discarded. Signed-off-by: NeilBrown <neilb@suse.de>
|
#
7eaceacc |
|
10-Mar-2011 |
Jens Axboe <jaxboe@fusionio.com> |
block: remove per-queue plugging Code has been converted over to the new explicit on-stack plugging, and delay users have been converted to use the new API for that. So let's kill off the old plugging along with aops->sync_page(). Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
#
99d03c14 |
|
13-Jan-2011 |
NeilBrown <neilb@suse.de> |
dm: per target unplug callback support Add per-target unplug callback support. Cc: linux-raid@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
9d357b07 |
|
13-Jan-2011 |
NeilBrown <neilb@suse.de> |
dm: introduce target callbacks and congestion callback DM currently implements congestion checking by checking on congestion in each component device. For raid456 we need to also check if the stripe cache is congested. Add per-target congestion checker callback support. Extending the target_callbacks structure with additional callback functions allows for establishing multiple callbacks per-target (a callback is also needed for unplug). Cc: linux-raid@vger.kernel.org Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
56a67df7 |
|
11-Aug-2010 |
Mike Snitzer <snitzer@redhat.com> |
dm: factor out max_io_len_target_boundary Split max_io_len_target_boundary out of max_io_len so that the discard support can make use of it without duplicating max_io_len code. Avoiding max_io_len's split_io logic enables DM's discard support to submit the entire discard request to a target. But discards must still be split on target boundaries. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
5ae89a87 |
|
11-Aug-2010 |
Mike Snitzer <snitzer@redhat.com> |
dm: linear support discard Allow discards to be passed through to linear mappings if at least one underlying device supports it. Discards will be forwarded only to devices that support them. A target that supports discards should set num_discard_requests to indicate how many times each discard request must be submitted to it. Verify table's underlying devices support discards prior to setting the associated DM device as capable of discards (via QUEUE_FLAG_DISCARD). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Joe Thornber <thornber@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
57cba5d3 |
|
11-Aug-2010 |
Mike Snitzer <snitzer@redhat.com> |
dm: rename map_info flush_request to target_request_nr 'target_request_nr' is a more generic name that reflects the fact that it will be used for both flush and discard support. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
8215d6ec |
|
05-Mar-2010 |
Nikanth Karthikesan <knikanth@novell.com> |
dm table: remove unused dm_get_device range parameters Remove unused parameters (start and len) of dm_get_device() and fix the callers. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
64dbce58 |
|
10-Dec-2009 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
dm: export suspended state to targets This patch adds the exported dm_suspended() function so that targets can check whether or not they are suspended. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
4f186f8b |
|
10-Dec-2009 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
dm: rename dm_suspended to dm_suspended_md This patch renames dm_suspended() to dm_suspended_md() and keeps it internal to dm. No functional change. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
042d2a9b |
|
10-Dec-2009 |
Alasdair G Kergon <agk@redhat.com> |
dm: keep old table until after resume succeeded When swapping a new table into place, retain the old table until its replacement is in place. An old check for an empty table is removed because this is enforced in populate_table(). __unbind() becomes redundant when followed by __bind(). Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
7c666411 |
|
10-Dec-2009 |
Alasdair G Kergon <agk@redhat.com> |
dm: rename dm_get_table to dm_get_live_table Rename dm_get_table to dm_get_live_table. Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
40bea431 |
|
04-Sep-2009 |
Mike Snitzer <snitzer@redhat.com> |
dm stripe: expose correct io hints Set sensible I/O hints for striped DM devices in the topology infrastructure added for 2.6.31 for userspace tools to obtain via sysfs. Add .io_hints to 'struct target_type' to allow the I/O hints portion (io_min and io_opt) of the 'struct queue_limits' to be set by each target and implement this for dm-stripe. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
5dea271b |
|
23-Jul-2009 |
Mike Snitzer <snitzer@redhat.com> |
dm table: pass correct dev area size to device_area_is_valid Incorrect device area lengths are being passed to device_area_is_valid(). The regression appeared in 2.6.31-rc1 through commit 754c5fc7ebb417b23601a6222a6005cc2e7f2913. With the dm-stripe target, the size of the target (ti->len) was used instead of the stripe_width (ti->len/#stripes). An example of a consequent incorrect error message is: device-mapper: table: 254:0: sdb too small for target Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
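
The effect of passing ti->len instead of the stripe width can be shown with a small userspace sketch. The function names and the example sizes are hypothetical; the sketch only models the arithmetic described above, not the kernel's device_area_is_valid().

```c
#include <assert.h>
#include <stdint.h>

/* Per-device share of a striped target: ti->len / #stripes. */
uint64_t stripe_width(uint64_t target_len, unsigned stripes)
{
    assert(stripes > 0);
    return target_len / stripes;
}

/* Hypothetical area check: does [start, start + len) fit on the device? */
int area_fits(uint64_t dev_size, uint64_t start, uint64_t len)
{
    return start <= dev_size && len <= dev_size - start;
}
```

With three 1000-sector devices backing a 3000-sector striped target, checking the whole target length against one device fails spuriously ("too small for target"), while checking the 1000-sector stripe width passes.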
|
#
cec47e3d |
|
22-Jun-2009 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
dm: prepare for request based option

This patch adds core functions for request-based dm.

When struct mapped_device (md) is initialized, md->queue has an I/O scheduler and the following functions are used for request-based dm as the queue functions:
    make_request_fn: dm_make_request()
    prep_fn: dm_prep_fn()
    request_fn: dm_request_fn()
    softirq_done_fn: dm_softirq_done()
    lld_busy_fn: dm_lld_busy()
Actual initializations are done in another patch (PATCH 2).

Below is a brief summary of how request-based dm behaves, including:
  - making request from bio
  - cloning, mapping and dispatching request
  - completing request and bio
  - suspending md
  - resuming md

bio to request
==============
md->queue->make_request_fn() (dm_make_request()) calls __make_request() for a bio submitted to the md. Then, the bio is kept in the queue as a new request or merged into another request in the queue if possible.

Cloning and Mapping
===================
Cloning and mapping are done in md->queue->request_fn() (dm_request_fn()), when requests are dispatched after they are sorted by the I/O scheduler.

dm_request_fn() checks the busy state of underlying devices using the target's busy() function and, if they are busy, stops dispatching requests to keep them on the dm device's queue. This helps I/O merging, since no merge is done for a request once it is dispatched to underlying devices.

Actual cloning and mapping are done in dm_prep_fn() and map_request() called from dm_request_fn(). dm_prep_fn() clones not only the request but also the bios of the request so that dm can hold bio completion in error cases and prevent the bio submitter from noticing the error. (See the "Completion" section below for details.)

After the cloning, the clone is mapped by the target's map_rq() function and inserted into the underlying device's queue using blk_insert_cloned_request().

Completion
==========
Request completion can be hooked by rq->end_io(), but then all bios in the request will have been completed even in error cases, and the bio submitter will have noticed the error. To prevent the bio completion in error cases, request-based dm clones both bio and request and hooks both bio->bi_end_io() and rq->end_io():
    bio->bi_end_io(): end_clone_bio()
    rq->end_io(): end_clone_request()

Summary of the request completion flow is below:
blk_end_request() for a clone request
  => blk_update_request()
     => bio->bi_end_io() == end_clone_bio() for each clone bio
        => Free the clone bio
        => Success: Complete the original bio (blk_update_request())
           Error: Don't complete the original bio
  => blk_finish_request()
     => rq->end_io() == end_clone_request()
        => blk_complete_request()
           => dm_softirq_done()
              => Free the clone request
              => Success: Complete the original request (blk_end_request())
                 Error: Requeue the original request

end_clone_bio() completes the original request on the size of the original bio in successful cases. Even if all bios in the original request are completed by that completion, the original request must not be completed yet to keep the ordering of request completion for the stacking. So end_clone_bio() uses blk_update_request() instead of blk_end_request(). In error cases, end_clone_bio() doesn't complete the original bio. It just frees the cloned bio and hands the error handling over to end_clone_request().

end_clone_request(), which is called with the queue lock held, completes the clone request and the original request in a softirq context (dm_softirq_done()), which has no queue lock, to avoid a deadlock on submission of another request during the completion:
  - The submitted request may be mapped to the same device
  - Request submission requires the queue lock, but the queue lock is already held by the submitter itself and it doesn't know that

The clone request has no clone bio when dm_softirq_done() is called. So target drivers can't resubmit it, even in error cases. Instead, they can ask dm core to requeue and remap the original request in those cases.

suspend
=======
Request-based dm uses stopping md->queue as suspend of the md. For noflush suspend, it just stops md->queue.

For flush suspend, it inserts a marker request at the tail of md->queue and dispatches all requests in md->queue until the marker comes to the front of md->queue. Then it stops dispatching requests and waits for all the dispatched requests to complete. After that, it completes the marker request, stops md->queue and wakes up the waiter on the suspend queue, md->wait.

resume
======
Starts md->queue.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
754c5fc7 |
|
22-Jun-2009 |
Mike Snitzer <snitzer@redhat.com> |
dm: calculate queue limits during resume not load Currently, device-mapper maintains a separate instance of 'struct queue_limits' for each table of each device. When the configuration of a device is to be changed, first its table is loaded and this structure is populated, then the device is 'resumed' and the calculated queue_limits are applied. This places restrictions on how userspace may process related devices, where it is often advantageous to 'load' tables for several devices at once before 'resuming' them together. As the new queue_limits only take effect after the 'resume', if they are changing and one device uses another, the latter must be 'resumed' before the former may be 'loaded'. This patch moves the calculation of these queue_limits out of the 'load' operation into 'resume'. Since we are no longer pre-calculating this struct, we no longer need to maintain copies within our dm structs. dm_set_device_limits() now passes the 'start' of the device's data area (aka pe_start) as the 'offset' to blk_stack_limits(). init_valid_queue_limits() is replaced by blk_set_default_limits(). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: martin.petersen@oracle.com Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
af4874e0 |
|
22-Jun-2009 |
Mike Snitzer <snitzer@redhat.com> |
dm targets: introduce iterate devices fn Add .iterate_devices to 'struct target_type' to allow a function to be called for all devices in a DM target. Implemented it for all targets except those in dm-snap.c (origin and snapshot). (The raid1 version number jumps to 1.12 because we originally reserved 1.1 to 1.11 for 'block_on_error' but ended up using 'handle_errors' instead.) Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: martin.petersen@oracle.com
|
#
5ab97588 |
|
22-Jun-2009 |
Mike Snitzer <snitzer@redhat.com> |
dm table: replace struct io_restrictions with struct queue_limits Use blk_stack_limits() to stack block limits (including topology) rather than duplicate the equivalent within Device Mapper. Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
f9ab94ce |
|
22-Jun-2009 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: introduce num_flush_requests Introduce num_flush_requests for a target to set to say how many flush instructions (empty barriers) it wants to receive. These are sent by __clone_and_map_empty_barrier with map_info->flush_request going from 0 to (num_flush_requests - 1). Old targets without flush support won't receive any flush requests. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
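The numbering described above can be sketched in user space as follows. This is an illustration only: `sketch_send_flushes()`, `sketch_flush_fn`, `struct sketch_flush_log` and `sketch_log_flush()` are hypothetical names, and the callback merely stands in for whatever the target does with each flush. The only behaviour taken from the message is that the index runs from 0 to (num_flush_requests - 1) and that a target with num_flush_requests of 0 receives nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type standing in for a target receiving a flush. */
typedef void (*sketch_flush_fn)(unsigned int flush_request, void *ctx);

/* Hypothetical record of what a target saw. */
struct sketch_flush_log {
	unsigned int count;	/* number of flushes received */
	unsigned int last;	/* index of the most recent flush */
};

/* Example callback: remember how many flushes arrived and the last index. */
void sketch_log_flush(unsigned int flush_request, void *ctx)
{
	struct sketch_flush_log *log = ctx;

	log->count++;
	log->last = flush_request;
}

/* Send one flush per requested slot, numbered 0 .. num_flush_requests - 1.
 * Returns the number of flushes sent. */
unsigned int sketch_send_flushes(unsigned int num_flush_requests,
				 sketch_flush_fn fn, void *ctx)
{
	unsigned int i;

	assert(fn != NULL);
	for (i = 0; i < num_flush_requests; i++)
		fn(i, ctx);
	return i;
}
```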
|
#
e1defc4f |
|
22-May-2009 |
Martin K. Petersen <martin.petersen@oracle.com> |
block: Do away with the notion of hardsect_size Until now we have had a 1:1 mapping between storage device physical block size and the logical block size used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512 bytes. Hence we need to distinguish between the physical block size and the logical one. This patch renames hardsect_size to logical_block_size. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
#
692d0eb9 |
|
08-Apr-2009 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: remove limited barrier support Prepare for full barrier implementation: first remove the restricted support. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
45194e4f |
|
02-Apr-2009 |
Cheng Renquan <crquan@gmail.com> |
dm target: remove struct tt_internal The tt_internal is really just a list_head used to manage registered target_types in a doubly linked list. Embed the list_head into target_type directly: 1. to avoid kmalloc/kfree; 2. so that tt_internal is no longer needed. Cc: stable@kernel.org Signed-off-by: Cheng Renquan <crquan@gmail.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com>
|
#
ab4c14248 |
|
05-Jan-2009 |
Andi Kleen <ak@linux.intel.com> |
dm: support barriers on simple devices Implement barrier support for single device DM devices This patch implements barrier support in DM for the common case of dm linear just remapping a single underlying device. In this case we can safely pass the barrier through because there can be no reordering between devices. NB. Any DM device might cease to support barriers if it gets reconfigured so code must continue to allow for a possible -EOPNOTSUPP on every barrier bio submitted. - agk Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
7d76345d |
|
05-Jan-2009 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
dm request: extend target interface

This patch adds the following target interfaces for request-based dm.

  map_rq    : for mapping a request
  rq_end_io : for finishing a request
  busy      : for avoiding performance regression from bio-based dm. Target can tell dm core not to map requests now, and that may help requests in the block layer queue to be bigger by I/O merging. In bio-based dm, this behavior is done by device drivers managing the block layer queue. But in request-based dm, dm core has to do that since dm core manages the block layer queue.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
10d3bd09 |
|
05-Jan-2009 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: consolidate target deregistration error handling Change dm_unregister_target to return void and use BUG() for error reporting. dm_unregister_target can only fail because of a programming bug in the target driver. It can't fail because of the user's behavior or disk errors. This patch changes dm_unregister_target to return void and use BUG() if someone tries to unregister a target that is not registered or that is in use. This patch removes code duplication (testing of error codes in all dm targets) and reports bugs in just one place, in dm_unregister_target. In some target drivers, these return codes were ignored, which could lead to a situation where bugs could be missed. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
d63a5ce3 |
|
21-Oct-2008 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: publish array_too_big Move array_too_big to include/linux/device-mapper.h because it is used by targets. Remove the test from dm-raid1 as the number of mirror legs is limited such that it can never fail. (Even for stripes it seems rather unlikely.) Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
647b3d00 |
|
28-Aug-2007 |
Al Viro <viro@zeniv.linux.org.uk> |
[PATCH] lose unused arguments in dm ioctl callbacks Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
aeb5d727 |
|
02-Sep-2008 |
Al Viro <viro@zeniv.linux.org.uk> |
[PATCH] introduce fmode_t, do annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
54160904 |
|
10-Oct-2008 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: publish dm_vcalloc Publish dm_vcalloc in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
ea0ec640 |
|
10-Oct-2008 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: publish dm_table_unplug_all Publish dm_table_unplug_all in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
89343da0 |
|
10-Oct-2008 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: publish dm_get_mapinfo Publish dm_get_mapinfo in include/linux/device-mapper.h because this function is used by targets. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
82b1519b |
|
10-Oct-2008 |
Mikulas Patocka <mpatocka@redhat.com> |
dm: export struct dm_dev Split struct dm_dev in two and publish the part that other targets need in include/linux/device-mapper.h. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
f6fccb12 |
|
20-Jul-2008 |
Milan Broz <mbroz@redhat.com> |
dm: introduce merge_bvec_fn Introduce a bvec merge function for device mapper devices for dynamic size restrictions. This code ensures the requested biovec lies within a single target and then calls a target-specific function to check against any constraints imposed by underlying devices. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
735643ee |
|
30-Apr-2008 |
Robert P. J. Day <rpjday@crashcourse.ca> |
Remove "#ifdef __KERNEL__" checks from unexported headers Remove the "#ifdef __KERNEL__" tests from unexported header files in linux/include whose entire contents are wrapped in that preprocessor test. Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4fdfe401 |
|
24-Apr-2008 |
Adrian Bunk <bunk@kernel.org> |
dm table: remove unused dm_create_error_table dm_create_error_table() was added in kernel 2.6.18 and never used... Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
0da336e5 |
|
24-Apr-2008 |
Alasdair G Kergon <agk@redhat.com> |
dm: expose macros Make dm.h macros and inlines available in include/linux/device-mapper.h Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
416cd17b |
|
24-Apr-2008 |
Heinz Mauelshagen <hjm@redhat.com> |
dm log: clean interface Clean up the dm-log interface to prepare for publishing it in include/linux. Signed-off-by: Heinz Mauelshagen <hjm@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
4f41b09f |
|
07-Feb-2008 |
Vasily Averin <vvs@sw.ru> |
dm: table remove unused variable Save some bytes. Signed-off-by: Vasily Averin <vvs@sw.ru> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
91212507 |
|
13-Dec-2007 |
Neil Brown <neilb@suse.de> |
dm: merge max_hw_sector Make sure dm honours max_hw_sectors of underlying devices We still have no firm testing evidence in support of this patch but believe it may help to resolve some bug reports. - agk Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
5ec140e6 |
|
31-Oct-2007 |
Vasily Averin <vvs@sw.ru> |
dm: bounce_pfn limit added Device mapper uses its own bounce_pfn that may differ from the one on the underlying device. As a result dm can build incorrect requests that contain sg elements larger than the underlying device is able to handle. This is the cause of slab corruption in the i2o layer, which occurred on i386 when very long direct I/O requests were addressed to a dm-over-i2o device. Signed-off-by: Vasily Averin <vvs@sw.ru> Cc: <stable@kernel.org> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
#
7a8c3d3b |
|
19-Oct-2007 |
Mike Anderson <andmike@linux.vnet.ibm.com> |
dm: uevent generate events This patch adds support for the dm_path_event and dm_send_event functions which create and send udev events. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
96a1f7db |
|
19-Oct-2007 |
Mike Anderson <andmike@linux.vnet.ibm.com> |
dm: export name and uuid This patch adds a function to obtain a copy of a mapped device's name and uuid. Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
|
#
2e93ccc1 |
|
08-Dec-2006 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
[PATCH] dm: suspend: add noflush pushback

In device-mapper I/O is sometimes queued within targets for later processing. For example the multipath target can be configured to store I/O when no paths are available instead of returning it with -EIO.

This patch allows the device-mapper core to instruct a target to transfer the contents of any such in-target queue back into the core. This frees up the resources used by the target so the core can replace that target with an alternative one and then resend the I/O to it. Without this patch the only way to change the target in such circumstances involves returning the I/O with an error back to the filesystem/application. In the multipath case, this patch will let us add new paths for existing I/O to try after all the existing paths have failed.

DMF_NOFLUSH_SUSPENDING
----------------------
If the DM_NOFLUSH_FLAG ioctl option is specified at suspend time, the DMF_NOFLUSH_SUSPENDING flag is set in md->flags during dm_suspend(). It is always cleared before dm_suspend() returns.

The flag must be visible while the target is flushing pending I/Os so it is set before presuspend where the flush starts and unset after the wait for md->pending where the flush ends.

Target drivers can check this flag by calling dm_noflush_suspending().

DM_MAPIO_REQUEUE / DM_ENDIO_REQUEUE
-----------------------------------
A target's map() function can now return DM_MAPIO_REQUEUE to request the device mapper core queue the bio. Similarly, a target's end_io() function can return DM_ENDIO_REQUEUE to request the same. This has been labelled 'pushback'.

The __map_bio() and clone_endio() functions in the core treat these return values as errors and call dec_pending() to end the I/O.

dec_pending
-----------
dec_pending() saves the pushback request in struct dm_io->error. Once all the split clones have ended, dec_pending() will put the original bio on the md->pushback list. Note that this supersedes any I/O errors.

It is possible for the suspend with DM_NOFLUSH_FLAG to be aborted while in progress (e.g. by user interrupt). dec_pending() checks for this and returns -EIO if it happened.

pushback list and pushback_lock
-------------------------------
The bio is queued on md->pushback temporarily in dec_pending(), and after all pending I/Os return, md->pushback is merged into md->deferred in dm_suspend() for re-issuing at resume time.

md->pushback_lock protects md->pushback. The lock should be held with irq disabled because dec_pending() can be called from interrupt context.

Queueing bios to md->pushback in dec_pending() must be done atomically with the check for the DMF_NOFLUSH_SUSPENDING flag. So md->pushback_lock is held when checking the flag. Otherwise dec_pending() may queue a bio to md->pushback after the interrupted dm_suspend() flushes md->pushback. Then the bio would be left in md->pushback.

Flag setting in dm_suspend() can be done without md->pushback_lock because the flag is checked only after presuspend and the set value is already made visible via the target's presuspend function.

The flag can be checked without md->pushback_lock (e.g. the first part of dec_pending() or target drivers), because the flag is checked again with md->pushback_lock held when the bio is really queued to md->pushback as described above. So even if the flag is cleared after the lockless checks, the bio isn't left in md->pushback but returned to applications with -EIO.

Other notes on the current patch
--------------------------------
- md->pushback is added to the struct mapped_device instead of using md->deferred directly because md->io_lock which protects md->deferred is rw_semaphore and can't be used in interrupt context like dec_pending(), and md->io_lock protects the DMF_BLOCK_IO flag of md->flags too.
- Don't issue lock_fs() in dm_suspend() if the DM_NOFLUSH_FLAG ioctl option is specified, because I/Os generated by lock_fs() would be pushed back and never return if there were no valid devices.
- If an error occurs in dm_suspend() after the DMF_NOFLUSH_SUSPENDING flag is set, md->pushback must be flushed because I/Os may be queued to the list already. (flush_and_out label in dm_suspend())

Test results
------------
I have tested using multipath target with the next patch.

The following tests are for regression/compatibility:
- I/Os succeed when valid paths exist;
- I/Os fail when there are no valid paths and queue_if_no_path is not set;
- I/Os are queued in the multipath target when there are no valid paths and queue_if_no_path is set;
- The queued I/Os above fail when suspend is issued without the DM_NOFLUSH_FLAG ioctl option. I/Os spanning 2 multipath targets also fail.

The following tests are for the normal code path of new pushback feature:
- Queued I/Os in the multipath target are flushed from the target but don't return when suspend is issued with the DM_NOFLUSH_FLAG ioctl option;
- The I/Os above are queued in the multipath target again when resume is issued without path recovery;
- The I/Os above succeed when resume is issued after path recovery or table load;
- Queued I/Os in the multipath target succeed when resume is issued with the DM_NOFLUSH_FLAG ioctl option after table load. I/Os spanning 2 multipath targets also succeed.

The following tests are for the error paths of the new pushback feature:
- When the bdget_disk() fails in dm_suspend(), the DMF_NOFLUSH_SUSPENDING flag is cleared and I/Os already queued to the pushback list are flushed properly.
- When suspend with the DM_NOFLUSH_FLAG ioctl option is interrupted,
  o I/Os which had already been queued to the pushback list at the time don't return, and are re-issued at resume time;
  o I/Os which hadn't been returned at the time return with EIO.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
45cbcd79 |
|
08-Dec-2006 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
[PATCH] dm: map and endio return code clarification

Tighten the use of return values from the target map and end_io functions. Values of 2 and above are now explicitly reserved for future use. There are no existing targets using such values. The patch has no effect on existing behaviour.

o Reserve return values of 2 and above from target map functions. Any positive value currently indicates "mapping complete", but all existing drivers use the value 1. We now make that a requirement so we can assign new meaning to higher values in future.

  The new definition of return values from target map functions is:
    < 0 : error
    = 0 : The target will handle the io (DM_MAPIO_SUBMITTED).
    = 1 : Mapping completed (DM_MAPIO_REMAPPED).
    > 1 : Reserved (undefined). Previously this was the same as '= 1'.

o Reserve return values of 2 and above from target end_io functions for similar reasons. DM_ENDIO_INCOMPLETE is introduced for a return value of 1.

Test results: I have tested by using the multipath target. I/Os succeed when valid paths exist. I/Os are queued in the multipath target when there are no valid paths and queue_if_no_path is set. I/Os fail when there are no valid paths and queue_if_no_path is not set.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: dm-devel@redhat.com
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
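The return-value ranges defined in this message can be read as a small decision function. The sketch below is a user-space illustration, not the dm core code: the two DM_MAPIO_* values are the ones listed above, while `enum sketch_map_action` and `sketch_classify_map_result()` are hypothetical names introduced here.

```c
#include <assert.h>

/* Values as listed in the commit message. */
#define DM_MAPIO_SUBMITTED 0
#define DM_MAPIO_REMAPPED  1

/* Hypothetical classification of a target map() return value. */
enum sketch_map_action {
	SKETCH_MAP_ERROR,		/* < 0: error */
	SKETCH_MAP_TARGET_OWNS_IO,	/* = 0: the target will handle the io */
	SKETCH_MAP_DISPATCH,		/* = 1: mapping completed */
	SKETCH_MAP_RESERVED		/* > 1: reserved (undefined) */
};

enum sketch_map_action sketch_classify_map_result(int r)
{
	if (r < 0)
		return SKETCH_MAP_ERROR;
	if (r == DM_MAPIO_SUBMITTED)
		return SKETCH_MAP_TARGET_OWNS_IO;
	if (r == DM_MAPIO_REMAPPED)
		return SKETCH_MAP_DISPATCH;
	/* Before this change any positive value meant "mapping complete". */
	return SKETCH_MAP_RESERVED;
}
```

Keeping the reserved range separate from "mapping completed" is what leaves room for later codes to be given their own meaning.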
|
#
a3d77d35 |
|
08-Dec-2006 |
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> |
[PATCH] dm: suspend: parameter change Change the interface of dm_suspend() so that we can pass several options without increasing the number of parameters. The existing 'do_lockfs' integer parameter is replaced by a flag DM_SUSPEND_LOCKFS_FLAG. There is no functional change to the code. Test results: I have tested 'dmsetup suspend' command with/without the '--nolockfs' option and confirmed the do_lockfs value is correctly set. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
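A minimal sketch of the interface change described above: an integer 'do_lockfs' parameter becomes one bit in a flags word, so further options can be added without new parameters. The SKETCH_* names, the bit positions and the second flag are assumptions made for illustration; they are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; positions are illustrative only. */
#define SKETCH_SUSPEND_LOCKFS_FLAG (1u << 0)
#define SKETCH_SUSPEND_OTHER_FLAG  (1u << 1)	/* room for a later option */

static_assert(SKETCH_SUSPEND_LOCKFS_FLAG != SKETCH_SUSPEND_OTHER_FLAG,
	      "each option needs its own bit");

/* Translate the old integer parameter into the new flags word. */
unsigned int sketch_flags_from_do_lockfs(int do_lockfs)
{
	return do_lockfs ? SKETCH_SUSPEND_LOCKFS_FLAG : 0u;
}

/* What a suspend routine would test instead of the old parameter. */
bool sketch_wants_lockfs(unsigned int suspend_flags)
{
	return (suspend_flags & SKETCH_SUSPEND_LOCKFS_FLAG) != 0;
}
```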
|
#
999d8168 |
|
03-Oct-2006 |
Bryn Reeves <breeves@redhat.com> |
[PATCH] dm table: add target flush This patch adds support for a per-target dm_flush_fn method. This is needed to allow dm-loop to invalidate page cache mappings in response to BLKFLSBUF ioctl commands. Signed-off-by: Bryn Reeves <breeves@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
3cb40214 |
|
03-Oct-2006 |
Bryn Reeves <breeves@redhat.com> |
[PATCH] dm: extract device limit setting Separate the setting of device I/O limits from dm_get_device(). dm-loop will use this. Signed-off-by: Bryn Reeves <breeves@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
8757b776 |
|
03-Oct-2006 |
Milan Broz <mbroz@redhat.com> |
[PATCH] dm table: add target preresume This patch adds a target preresume hook. It is called before the targets are resumed and if it returns an error the resume gets cancelled. The crypt target will use this to indicate that it is unable to process I/O because no encryption key has been supplied. Signed-off-by: Milan Broz <mbroz@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
aa129a22 |
|
03-Oct-2006 |
Milan Broz <mbroz@redhat.com> |
[PATCH] dm: support ioctls on mapped devices

Extend the core device-mapper infrastructure to accept arbitrary ioctls on a mapped device provided that it has exactly one target and it is capable of supporting ioctls.

[We can't use unlocked_ioctl because we need 'inode': 'file' might be NULL. Is it worth changing this?]

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

Arnd Bergmann <arnd@arndb.de> wrote:
> On Wednesday 21 June 2006 21:31, Alasdair G Kergon wrote:
> > static struct block_device_operations dm_blk_dops = {
> > .open = dm_blk_open,
> > .release = dm_blk_close,
> > +.ioctl = dm_blk_ioctl,
> > .getgeo = dm_blk_getgeo,
> > .owner = THIS_MODULE
>
> I guess this also needs a ->compat_ioctl method, otherwise it won't
> work for ioctl numbers that have a compat_ioctl implementation in the
> low-level device driver.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
72d94861 |
|
26-Jun-2006 |
Alasdair G Kergon <agk@redhat.com> |
[PATCH] dm: improve error message consistency Tidy device-mapper error messages to include context information automatically. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
c2ade42d |
|
26-Jun-2006 |
David Teigland <teigland@redhat.com> |
[PATCH] dm: create error table Add a library function dm_create_error_table() to create a table that rejects any I/O sent to a device with EIO. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
17b2f66f |
|
26-Jun-2006 |
Alasdair G Kergon <agk@redhat.com> |
[PATCH] dm: add exports Move definitions of core device-mapper functions for manipulating mapped devices and their tables to <linux/device-mapper.h> advertising their availability for use elsewhere in the kernel. Protect the contents of device-mapper.h with ifdef __KERNEL__. And throw in a few formatting clean-ups and extra comments. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
969429b5 |
|
27-Mar-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] dm: make sure QUEUE_FLAG_CLUSTER is set properly This flag should be set for a virtual device iff it is set for all underlying devices. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
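The "iff it is set for all underlying devices" rule amounts to a logical AND over the devices in the table. The sketch below shows that reduction in user space; `sketch_table_cluster_flag()` and its array-of-bool input are hypothetical, and the result for an empty device list (true here) is an assumption the message does not address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction: the virtual device gets the flag only if every
 * underlying device has it. dev_cluster[i] is true when device i has the
 * flag set. */
bool sketch_table_cluster_flag(const bool *dev_cluster, size_t ndevs)
{
	assert(dev_cluster != NULL || ndevs == 0);
	for (size_t i = 0; i < ndevs; i++) {
		if (!dev_cluster[i])
			return false;	/* one device without it clears the flag */
	}
	return true;
}
```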
|
#
3ee247eb |
|
01-Feb-2006 |
Alasdair G Kergon <agk@redhat.com> |
[PATCH] dm: dm-table warning fix drivers/md/dm-table.c:500: warning: comparison of distinct pointer types lacks a cast Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|