History log of /linux-master/drivers/md/Kconfig
Revision Date Author Comments
# cd14b018 11-Feb-2024 Masahiro Yamada <masahiroy@kernel.org>

treewide: replace or remove redundant def_bool in Kconfig files

'def_bool X' is a shorthand for 'bool' plus 'default X'.

'def_bool' is redundant where 'bool' is already present, so 'def_bool X'
can be replaced with 'default X', or removed if X is 'n'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>


# f36b1d3b 26-Jan-2024 Mike Snitzer <snitzer@kernel.org>

dm vdo: use a proper Makefile for dm-vdo

Requires moving dm-vdo-target.c into drivers/md/dm-vdo/

This change adds a proper drivers/md/dm-vdo/Makefile and eliminates
the abnormal use of patsubst in drivers/md/Makefile -- which was the
cause of at least one build failure that was reported by the upstream
build bot.

Also, split out VDO's drivers/md/dm-vdo/Kconfig and include it from
drivers/md/Kconfig

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Matthew Sakai <msakai@redhat.com>


# f11aca85 16-Nov-2023 Matthew Sakai <msakai@redhat.com>

dm vdo: enable configuration and building of dm-vdo

dm-vdo targets are not supported for 32-bit configurations. A vdo target
typically requires 1 to 1.5 GB of memory at any given time, which is likely
a large fraction of the addressable memory of a 32-bit system. At the same
time, the amount of addressable storage attached to a 32-bit system may not
be large enough for deduplication to provide much benefit. Because of these
concerns, 32-bit platforms are deemed unlikely to benefit from using a vdo
target, so dm-vdo is targeted only at 64-bit platforms.

Co-developed-by: J. corwin Coburn <corwin@hurlbutnet.net>
Signed-off-by: J. corwin Coburn <corwin@hurlbutnet.net>
Co-developed-by: John Wiele <jwiele@redhat.com>
Signed-off-by: John Wiele <jwiele@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>


# 415c7451 14-Dec-2023 Song Liu <song@kernel.org>

md: Remove deprecated CONFIG_MD_FAULTY

md-faulty has been marked as deprecated for 2.5 years. Remove it.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Guoqing Jiang <guoqing.jiang@linux.dev>
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231214222107.2016042-4-song@kernel.org


# d8730f0c 14-Dec-2023 Song Liu <song@kernel.org>

md: Remove deprecated CONFIG_MD_MULTIPATH

md-multipath has been marked as deprecated for 2.5 years. Remove it.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Guoqing Jiang <guoqing.jiang@linux.dev>
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231214222107.2016042-3-song@kernel.org


# 849d18e2 14-Dec-2023 Song Liu <song@kernel.org>

md: Remove deprecated CONFIG_MD_LINEAR

md-linear has been marked as deprecated for 2.5 years. Remove it.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Neil Brown <neilb@suse.de>
Cc: Guoqing Jiang <guoqing.jiang@linux.dev>
Cc: Mateusz Grzonka <mateusz.grzonka@intel.com>
Cc: Jes Sorensen <jes@trained-monkey.org>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231214222107.2016042-2-song@kernel.org


# 6849302f 13-Dec-2023 Mike Snitzer <snitzer@kernel.org>

dm audit: fix Kconfig so DM_AUDIT depends on BLK_DEV_DM

Signed-off-by: Mike Snitzer <snitzer@kernel.org>


# 925c86a1 01-Aug-2023 Christoph Hellwig <hch@lst.de>

fs: add CONFIG_BUFFER_HEAD

Add a new config option that controls building the buffer_head code, and
select it from all file systems and stacking drivers that need it.

For the block device nodes and alternative iomap based buffered I/O path
is provided when buffer_head support is not enabled, and iomap needs a
a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call
into the buffer_head code when it doesn't exist.

Otherwise this is just Kconfig and ifdef changes.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 0ae1c9d3 15-Jun-2023 Christoph Hellwig <hch@lst.de>

md: deprecate bitmap file support

The support for bitmaps on files is a very bad idea abusing various kernel
APIs, and fundamentally requires the file to not be on the actual array
without a way to check that this is actually the case. Add a deprecation
warning to see if we might be able to eventually drop it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230615064840.629492-12-hch@lst.de


# a34d4ef8 15-Jun-2023 Christoph Hellwig <hch@lst.de>

md: make bitmap file support optional

The support for write intent bitmaps in files on an external files in md
is a hot mess that abuses ->bmap to map file offsets into physical device
objects, and also abuses buffer_heads in a creative way.

Make this code optional so that MD can be built into future kernels
without buffer_head support, and so that we can eventually deprecate it.

Note this does not affect the internal bitmap support, which has none of
the problems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230615064840.629492-11-hch@lst.de


# 6c0f5898 13-Mar-2023 NeilBrown <neilb@suse.de>

md: select BLOCK_LEGACY_AUTOLOAD

When BLOCK_LEGACY_AUTOLOAD is not enable, mdadm is not able to
activate new arrays unless "CREATE names=yes" appears in
mdadm.conf

As this is a regression we need to always enable BLOCK_LEGACY_AUTOLOAD
for when MD is selected - at least until mdadm is updated and the
updates widely available.

Cc: stable@vger.kernel.org # v5.18+
Fixes: fbdee71bb5d8 ("block: deprecate autoloading based on dev_t")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <song@kernel.org>


# 9276cf8b 22-Nov-2022 Paul E. McKenney <paulmck@kernel.org>

drivers/md: Remove "select SRCU"

Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in selecting it. Therefore, remove the "select SRCU"
Kconfig statements.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: <dm-devel@redhat.com>
Cc: <linux-raid@vger.kernel.org>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>


# 248c7933 15-Feb-2022 Christoph Hellwig <hch@lst.de>

blk-mq: make the blk-mq stacking code optional

The code to stack blk-mq drivers is only used by dm-multipath, and
will preferably stay that way. Make it optional and only selected
by device mapper, so that the buildbots more easily catch abuses
like the one that slipped in in the ufs driver in the last merged
window. Another positive side effects is that kernel builds without
device mapper shrink a little bit as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20220215100540.3892965-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 82bb8599 04-Sep-2021 Michael Weiß <michael.weiss@aisec.fraunhofer.de>

dm integrity: log audit events for dm-integrity target

dm-integrity signals integrity violations by returning I/O errors
to user space. To identify integrity violations by a controlling
instance, the kernel audit subsystem can be used to emit audit
events to user space. We use the new dm-audit submodule allowing
to emit audit events on relevant I/O errors.

The construction and destruction of integrity device mappings are
also relevant for auditing a system. Thus, those events are also
logged as audit events.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 2cc1ae48 04-Sep-2021 Michael Weiß <michael.weiss@aisec.fraunhofer.de>

dm: introduce audit event module for device mapper

To be able to send auditing events to user space, we introduce a
generic dm-audit module. It provides helper functions to emit audit
events through the kernel audit subsystem. We claim the
AUDIT_DM_CTRL type=1336 and AUDIT_DM_EVENT type=1337 out of the
audit event messages range in the corresponding userspace api in
'include/uapi/linux/audit.h' for those events.

AUDIT_DM_CTRL is used to provide information about creation and
destruction of device mapper targets which are triggered by user space
admin control actions.
AUDIT_DM_EVENT is used to provide information about actual errors
during operation of the mapped device, showing e.g. integrity
violations in audit log.

Following commits to device mapper targets actually will make use of
this to emit those events in relevant cases.

The audit logs look like this if executing the following simple test:

# dd if=/dev/zero of=test.img bs=1M count=1024
# losetup -f test.img
# integritysetup -vD format --integrity sha256 -t 32 /dev/loop0
# integritysetup open -D /dev/loop0 --integrity sha256 integritytest
# integritysetup status integritytest
# integritysetup close integritytest
# integritysetup open -D /dev/loop0 --integrity sha256 integritytest
# integritysetup status integritytest
# dd if=/dev/urandom of=/dev/loop0 bs=512 count=1 seek=100000
# dd if=/dev/mapper/integritytest of=/dev/null

-------------------------
audit.log from auditd

type=UNKNOWN[1336] msg=audit(1630425039.363:184): module=integrity
op=ctr ppid=3807 pid=3819 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1
type=UNKNOWN[1336] msg=audit(1630425039.471:185): module=integrity
op=dtr ppid=3807 pid=3819 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1
type=UNKNOWN[1336] msg=audit(1630425039.611:186): module=integrity
op=ctr ppid=3807 pid=3819 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1
type=UNKNOWN[1336] msg=audit(1630425054.475:187): module=integrity
op=dtr ppid=3807 pid=3819 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1

type=UNKNOWN[1336] msg=audit(1630425073.171:191): module=integrity
op=ctr ppid=3807 pid=3883 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1

type=UNKNOWN[1336] msg=audit(1630425087.239:192): module=integrity
op=dtr ppid=3807 pid=3902 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1

type=UNKNOWN[1336] msg=audit(1630425093.755:193): module=integrity
op=ctr ppid=3807 pid=3906 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=pts2 ses=3 comm="integritysetup"
exe="/sbin/integritysetup" subj==unconfined dev=254:3
error_msg='success' res=1

type=UNKNOWN[1337] msg=audit(1630425112.119:194): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:195): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:196): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:197): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:198): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:199): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:200): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:201): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:202): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0
type=UNKNOWN[1337] msg=audit(1630425112.119:203): module=integrity
op=integrity-checksum dev=254:3 sector=77480 res=0

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Paul Moore <paul@paul-moore.com> # fix audit.h numbering
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 1c277e50 04-Aug-2021 Christoph Hellwig <hch@lst.de>

dm: make EBS depend on !HIGHMEM

__ebs_rw_bvec use page_address on the submitted bios data, and thus
can't deal with highmem. Disable the target on highmem configs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# c66fd019 04-Aug-2021 Christoph Hellwig <hch@lst.de>

block: make the block holder code optional

Move the block holder code into a separate file as it is not in any way
related to the other block_dev.c code, and add a new selectable config
option for it so that we don't have to build it without any remapped
drivers selected.

The Kconfig symbol contains a _DEPRECATED suffix to match the comments
added in commit 49731baa41df
("block: restore multiple bd_link_disk_holder() support").

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20210804094147.459763-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# 608f52e3 25-May-2021 Guoqing Jiang <jgq516@gmail.com>

md: mark some personalities as deprecated

Mark the three personalities (linear, fault and multipath) as deprecated
because:

1. people can use dm multipath or nvme multipath.
2. linear is already deprecated in MODULE_ALIAS.
3. no one actively using fault.

Signed-off-by: Guoqing Jiang <jiangguoqing@kylinos.cn>
Signed-off-by: Song Liu <song@kernel.org>


# 363880c4 22-Jan-2021 Ahmad Fatoum <a.fatoum@pengutronix.de>

dm crypt: support using trusted keys

Commit 27f5411a718c ("dm crypt: support using encrypted keys") extended
dm-crypt to allow use of "encrypted" keys along with "user" and "logon".

Along the same lines, teach dm-crypt to support "trusted" keys as well.

Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# b690bd54 03-Jan-2021 Arnd Bergmann <arnd@arndb.de>

dm zoned: select CONFIG_CRC32

Without crc32 support, this driver fails to link:

arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_write_sb':
dm-zoned-metadata.c:(.text+0xe98): undefined reference to `crc32_le'
arm-linux-gnueabi-ld: drivers/md/dm-zoned-metadata.o: in function `dmz_check_sb':
dm-zoned-metadata.c:(.text+0x7978): undefined reference to `crc32_le'

Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# f7b347ac 14-Dec-2020 Anthony Iliopoulos <ailiop@suse.com>

dm integrity: select CRYPTO_SKCIPHER

The integrity target relies on skcipher for encryption/decryption, but
certain kernel configurations may not enable CRYPTO_SKCIPHER, leading to
compilation errors due to unresolved symbols. Explicitly select
CRYPTO_SKCIPHER for DM_INTEGRITY, since it is unconditionally dependent
on it.

Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# e4d2e82b 22-Oct-2020 Mike Christie <michael.christie@oracle.com>

dm mpath: add IO affinity path selector

This patch adds a path selector that selects paths based on a CPU to
path mapping the user passes in and what CPU we are executing on. The
primary user for this PS is where the app is optimized to use specific
CPUs so other PSs undo the apps handy work, and the storage and it's
transport are not a bottlneck.

For these io-affinity PS setups a path's transport/interconnect
perf is not going to flucuate a lot and there is no major differences
between paths, so QL/HST smarts do not help and RR always messes up
what the app is trying to do.

On a system with 16 cores, where you have a job per CPU:

fio --filename=/dev/dm-0 --direct=1 --rw=randrw --bs=4k \
--ioengine=libaio --iodepth=128 --numjobs=16

and a dm-multipath device setup where each CPU is mapped to one path:

// When in mq mode I had to set dm_mq_nr_hw_queues=$NUM_PATHS.
// Bio mode also showed similar results.
0 16777216 multipath 0 0 1 1 io-affinity 0 16 1 8:16 1 8:32 2 8:64 4
8:48 8 8:80 10 8:96 20 8:112 40 8:128 80 8:144 100 8:160 200 8:176
400 8:192 800 8:208 1000 8:224 2000 8:240 4000 65:0 8000

we can see a IOPs increase of 25%.

The percent increase depends on the device and interconnect. For a
slower/medium speed path/device that can do around 180K IOPs a path
if you ran that fio command to it directly we saw a 25% increase like
above. Slower path'd devices that could do around 90K per path showed
maybe around a 2 - 5% increase. If you use something like null_blk or
scsi_debug which can multi-million IOPs and hack it up so each device
they export shows up as a path then you see 50%+ increases.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 4da8f8c8 23-Oct-2020 Mickaël Salaün <mic@linux.microsoft.com>

dm verity: Add support for signature verification with 2nd keyring

Add a new configuration DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
to enable dm-verity signatures to be verified against the secondary
trusted keyring. Instead of relying on the builtin trusted keyring
(with hard-coded certificates), the second trusted keyring can include
certificate authorities from the builtin trusted keyring and child
certificates loaded at run time. Using the secondary trusted keyring
enables to use dm-verity disks (e.g. loop devices) signed by keys which
did not exist at kernel build time, leveraging the certificate chain of
trust model. In practice, this makes it possible to update certificates
without kernel update and reboot, aligning with module and kernel
(kexec) signature verification which already use the secondary trusted
keyring.

Signed-off-by: Mickaël Salaün <mic@linux.microsoft.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 6f3bc22b 26-Jun-2020 Alexander A. Klimov <grandmaster@al2klimov.de>

Replace HTTP links with HTTPS ones: LVM

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Link: https://lore.kernel.org/r/20200627103138.71885-1-grandmaster@al2klimov.de
Signed-off-by: Jonathan Corbet <corbet@lwn.net>


# a7f7f624 13-Jun-2020 Masahiro Yamada <masahiroy@kernel.org>

treewide: replace '---help---' in Kconfig files with 'help'

Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>


# 2613eab1 30-Apr-2020 Khazhismel Kumykov <khazhy@google.com>

dm mpath: add Historical Service Time Path Selector

This new selector keeps an exponential moving average of the service
time for each path (losely defined as delta between start_io and
end_io), and uses this along with the number of inflight requests to
estimate future service time for a path. Since we don't have a prober
to account for temporally slow paths, re-try "slow" paths every once in
a while (num_paths * historical_service_time). To account for fast paths
transitioning to slow, if a path has not completed any request within
(num_paths * historical_service_time), limit the number of outstanding
requests. To account for low volume situations where number of
inflight IOs would be zero, the last finish time of each path is
factored in.

Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# d3c7b35c 09-Mar-2020 Heinz Mauelshagen <heinzm@redhat.com>

dm: add emulated block size target

This new target is similar to the linear target except that it emulates
a smaller logical block size on a device with a larger logical block
size. Its main purpose is to emulate 512 byte sectors on 4K native
disks (i.e. 512e).

See Documentation/admin-guide/device-mapper/dm-ebs.rst for details.

Reviewed-by: Damien Le Moal <DamienLeMoal@wdc.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org> [Kconfig fixes]
Signed-off-by: Zheng Bin <zhengbin13@huawei.com> [static fixes]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 27f5411a 20-Apr-2020 Dmitry Baryshkov <dbaryshkov@gmail.com>

dm crypt: support using encrypted keys

Allow one to use "encrypted" in addition to "user" and "logon" key
types for device encryption.

Signed-off-by: Dmitry Baryshkov <dmitry_baryshkov@mentor.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 44363322 20-Nov-2019 Krzysztof Kozlowski <krzk@kernel.org>

dm: Fix Kconfig indentation

Adjust indentation from spaces to tab (+optional two spaces) as in
coding style with command like:
$ sed -e 's/^ /\t/' -i */Kconfig

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 7431b783 11-Sep-2019 Nikos Tsironis <ntsironis@arrikto.com>

dm: add clone target

Add the dm-clone target, which allows cloning of arbitrary block
devices.

dm-clone produces a one-to-one copy of an existing, read-only source
device into a writable destination device: It presents a virtual block
device which makes all data appear immediately, and redirects reads and
writes accordingly.

The main use case of dm-clone is to clone a potentially remote,
high-latency, read-only, archival-type block device into a writable,
fast, primary-type device for fast, low-latency I/O. The cloned device
is visible/mountable immediately and the copy of the source device to
the destination device happens in the background, in parallel with user
I/O.

When the cloning completes, the dm-clone table can be removed altogether
and be replaced, e.g., by a linear table, mapping directly to the
destination device.

For further information and examples of how to use dm-clone, please read
Documentation/admin-guide/device-mapper/dm-clone.rst

Suggested-by: Vangelis Koukis <vkoukis@arrikto.com>
Co-developed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# a1a262b6 19-Aug-2019 Ard Biesheuvel <ardb@kernel.org>

dm crypt: switch to ESSIV crypto API template

Replace the explicit ESSIV handling in the dm-crypt driver with calls
into the crypto API, which now possesses the capability to perform
this processing within the crypto subsystem.

Note that we reorder the AEAD cipher_api string parsing with the TFM
instantiation: this is needed because cipher_api is mangled by the
ESSIV handling, and throws off the parsing of "authenc(" otherwise.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 88cd3e6c 17-Jul-2019 Jaskaran Khurana <jaskarankhurana@linux.microsoft.com>

dm verity: add root hash pkcs#7 signature verification

The verification is to support cases where the root hash is not secured
by Trusted Boot, UEFI Secureboot or similar technologies.

One of the use cases for this is for dm-verity volumes mounted after
boot, the root hash provided during the creation of the dm-verity volume
has to be secure and thus in-kernel validation implemented here will be
used before we trust the root hash and allow the block device to be
created.

The signature being provided for verification must verify the root hash
and must be trusted by the builtin keyring for verification to succeed.

The hash is added as a key of type "user" and the description is passed
to the kernel so it can look it up and use it for verification.

Adds CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG which can be turned on if root
hash verification is needed.

Kernel commandline dm_verity module parameter 'require_signatures' will
indicate whether to force root hash signature verification (for all dm
verity volumes).

Signed-off-by: Jaskaran Khurana <jaskarankhurana@linux.microsoft.com>
Tested-and-Reviewed-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 6cf2a73c 17-Jun-2019 Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

docs: device-mapper: move it to the admin-guide

The DM support describes lots of aspects related to mapped
disk partitions from the userspace PoV.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>


# f0ba4377 12-Jun-2019 Mauro Carvalho Chehab <mchehab+samsung@kernel.org>

docs: convert docs to ReST and rename to *.rst

The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>


# ec8f24b7 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Add SPDX license identifier - Makefile/Kconfig

Add SPDX license identifiers to all Make/Kconfig files which:

- Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# e4f3fabd 07-Mar-2019 Bryan Gurney <bgurney@redhat.com>

dm: add dust target

Add the dm-dust target, which simulates the behavior of bad sectors
at arbitrary locations, and the ability to enable the emulation of
the read failures at an arbitrary time.

This target behaves similarly to a linear target. At a given time,
the user can send a message to the target to start failing read
requests on specific blocks. When the failure behavior is enabled,
reads of blocks configured "bad" will fail with EIO.

Writes of blocks configured "bad" will result in the following:

1. Remove the block from the "bad block list".
2. Successfully complete the write.

After this point, the block will successfully contain the written
data, and will service reads and writes normally. This emulates the
behavior of a "remapped sector" on a hard disk drive.

dm-dust provides logging of which blocks have been added or removed
to the "bad block list", as well as logging when a block has been
removed from the bad block list. These messages can be used
alongside the messages from the driver using a dm-dust device to
analyze the driver's behavior when a read fails at a given time.

(This logging can be reduced via a "quiet" mode, if desired.)

NOTE: If the block size is larger than 512 bytes, only the first sector
of each "dust block" is detected. Placing a limiting layer above a dust
target, to limit the minimum I/O size to the dust block size, will
ensure proper emulation of the given large block size.

Signed-off-by: Bryan Gurney <bgurney@redhat.com>
Co-developed-by: Joe Shimkus <jshimkus@redhat.com>
Co-developed-by: John Dorminy <jdorminy@redhat.com>
Co-developed-by: John Pittman <jpittman@redhat.com>
Co-developed-by: Thomas Jaskiewicz <tjaskiew@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 6bbc923d 21-Feb-2019 Helen Koike <helen.koike@collabora.com>

dm: add support to directly boot to a mapped device

Add a "create" module parameter, which allows device-mapper targets to
be configured at boot time. This enables early use of DM targets in the
boot process (as the root device or otherwise) without the need of an
initramfs.

The syntax used in the boot param is based on the concise format from
the dmsetup tool to follow the rule of least surprise:

dmsetup table --concise /dev/mapper/lroot

Which is:
dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+]

Where,
<name> ::= The device name.
<uuid> ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ""
<minor> ::= The device minor number | ""
<flags> ::= "ro" | "rw"
<table> ::= <start_sector> <num_sectors> <target_type> <target_args>
<target_type> ::= "verity" | "linear" | ...

For example, the following could be added in the boot parameters:
dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0

Only the targets that were tested are allowed and the ones that don't
change any block device when the device is create as read-only. For
example, mirror and cache targets are not allowed. The rationale behind
this is that if the user makes a mistake, choosing the wrong device to
be the mirror or the cache can corrupt data.

The only targets initially allowed are:
* crypt
* delay
* linear
* snapshot-origin
* striped
* verity

Co-developed-by: Will Drewry <wad@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Helen Koike <helen.koike@collabora.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 6a23e05c 10-Oct-2018 Jens Axboe <axboe@kernel.dk>

dm: remove legacy request-based IO path

dm supports both, and since we're killing off the legacy path in
general, get rid of it in dm.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 48debafe 08-Mar-2018 Mikulas Patocka <mpatocka@redhat.com>

dm: add writecache target

The writecache target caches writes on persistent memory or SSD.
It is intended for databases or other programs that need extremely low
commit latency.

The writecache target doesn't cache reads because reads are supposed to
be cached in page cache in normal RAM.

If persistent memory isn't available this target can still be used in
SSD mode.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com> # fix missing goto
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> # fix compilation issue with !DAX
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> # use msecs_to_jiffies
Acked-by: Dan Williams <dan.j.williams@intel.com> # reworks to unify ARM and x86 flushing
Signed-off-by: Mike Snitzer <msnitzer@redhat.com>


# 976431b0 29-Mar-2018 Dan Williams <dan.j.williams@intel.com>

dax, dm: allow device-mapper to operate without dax support

Change device-mapper's DAX dependency to require the presence of at
least one DAX_DRIVER. This allows device-mapper to be built without
bringing the DAX core along which is especially wasteful when there are
no DAX drivers, like BLK_DEV_PMEM, configured.

Cc: Alasdair Kergon <agk@redhat.com>
Reported-by: Bart Van Assche <Bart.VanAssche@wdc.com>
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# 18a5bf27 18-Dec-2017 Scott Bauer <scott.bauer@intel.com>

dm: add unstriped target

This device mapper "unstriped" target remaps and unstripes I/O so it
is issued solely on a single drive in a HW RAID0 or dm-striped target.

In a 4 drive HW RAID0 the striped target exposes 1/4th of the LBA range
as a virtual drive. Each I/O to that virtual drive will only be issued
to the 1 drive that was selected of the 4 drives in the HW RAID0.

This unstriped target is most useful for Intel NVMe drives that have
multiple cores but that do not have firmware control to pin separate LBA
ranges to each discrete cpu core.

Signed-off-by: Scott Bauer <scott.bauer@intel.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# f0e230ad 24-Oct-2017 Guoqing Jiang <gqjiang@suse.com>

md-cluster: update document for raid10

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>


# 3b1a94c8 07-Jun-2017 Damien Le Moal <damien.lemoal@wdc.com>

dm zoned: drive-managed zoned block device target

The dm-zoned device mapper target provides transparent write access
to zoned block devices (ZBC and ZAC compliant block devices).
dm-zoned hides to the device user (a file system or an application
doing raw block device accesses) any constraint imposed on write
requests by the device, equivalent to a drive-managed zoned block
device model.

Write requests are processed using a combination of on-disk buffering
using the device conventional zones and direct in-place processing for
requests aligned to a zone sequential write pointer position.
A background reclaim process implemented using dm_kcopyd_copy ensures
that conventional zones are always available for executing unaligned
write requests. The reclaim process overhead is minimized by managing
buffer zones in a least-recently-written order and first targeting the
oldest buffer zones. Doing so, blocks under regular write access (such
as metadata blocks of a file system) remain stored in conventional
zones, resulting in no apparent overhead.

dm-zoned implementation focus on simplicity and on minimizing overhead
(CPU, memory and storage overhead). For a 14TB host-managed disk with
256 MB zones, dm-zoned memory usage per disk instance is at most about
3 MB and as little as 5 zones will be used internally for storing metadata
and performing buffer zone reclaim operations. This is achieved using
zone level indirection rather than a full block indirection system for
managing block movement between zones.

dm-zoned primary target is host-managed zoned block devices but it can
also be used with host-aware device models to mitigate potential
device-side performance degradation due to excessive random writing.

Zoned block devices can be formatted and checked for use with the dm-zoned
target using the dmzadm utility available at:

https://github.com/hgst/dm-zoned-tools

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
[Mike Snitzer partly refactored Damien's original work to cleanup the code]
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 7ab84db6 04-May-2017 Mike Snitzer <snitzer@redhat.com>

dm integrity: improve the Kconfig help text for DM_INTEGRITY

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Milan Broz <gmazyland@gmail.com>


# f26c5719 12-Apr-2017 Dan Williams <dan.j.williams@intel.com>

dm: add dax_device and dax_operations support

Allocate a dax_device to represent the capacity of a device-mapper
instance. Provide a ->direct_access() method via the new dax_operations
indirection that mirrors the functionality of the current direct_access
support via block_device_operations. Once fs/dax.c has been converted
to use dax_operations the old dm_blk_direct_access() will be removed.

A new helper dm_dax_get_live_target() is introduced to separate some of
the dm-specifics from the direct_access implementation.

This enabling is only for the top-level dm representation to upper
layers. Converting target direct_access implementations is deferred to a
separate patch.

Cc: Toshi Kani <toshi.kani@hpe.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# 7b81ef8b 27-Mar-2017 Mikulas Patocka <mpatocka@redhat.com>

dm raid: select the Kconfig option CONFIG_MD_RAID0

Since the commit 0cf4503174c1 ("dm raid: add support for the MD RAID0
personality"), the dm-raid subsystem can activate a RAID-0 array.
Therefore, add MD_RAID0 to the dependencies of DM_RAID, so that MD_RAID0
will be selected when DM_RAID is selected.

Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality")
Cc: stable@vger.kernel.org # v4.2+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 4f6cce39 27-Mar-2017 SeongJae Park <sj38.park@gmail.com>

Fix dead URLs to ftp.kernel.org

URLs to ftp.kernel.org are still exist though the service is closed [0].
This commit fixes the URLs to use www.kernel.org instead.

[0] https://www.kernel.org/shutting-down-ftp-services.html

Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 7eada909 04-Jan-2017 Mikulas Patocka <mpatocka@redhat.com>

dm: add integrity target

The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.

A general problem with storing integrity tags with every sector is that
writing the sector and the integrity tag must be atomic - i.e. in case of
crash, either both sector and integrity tag or none of them is written.

To guarantee write atomicity the dm-integrity target uses a journal. It
writes sector data and integrity tags into a journal, commits the journal
and then copies the data and integrity tags to their respective location.

The dm-integrity target can be used with the dm-crypt target - in this
situation the dm-crypt target creates the integrity data and passes them
to the dm-integrity target via bio_integrity_payload attached to the bio.
In this mode, the dm-crypt and dm-integrity targets provide authenticated
disk encryption - if the attacker modifies the encrypted device, an I/O
error is returned instead of random data.

The dm-integrity target can also be used as a standalone target, in this
mode it calculates and verifies the integrity tag internally. In this
mode, the dm-integrity target can be used to detect silent data
corruption on the disk or in the I/O path.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# b29d4986 15-Dec-2016 Joe Thornber <ejt@redhat.com>

dm cache: significant rework to leverage dm-bio-prison-v2

The cache policy interfaces have been updated to work well with the new
bio-prison v2 interface's ability to queue work immediately (promotion,
demotion, etc) -- overriding benefit being reduced latency on processing
IO through the cache. Previously such work would be left for the DM
cache core to queue on various lists and then process in batches later
-- this caused a serious delay in latency for IO driven by the cache.

The background tracker code was factored out so that all cache policies
can make use of it.

Also, the "cleaner" policy has been removed and is now a variant of the
smq policy that simply disallows migrations.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 2e8ed711 19-Nov-2015 Joe Thornber <ejt@redhat.com>

dm block manager: make block locking optional

The block manager's locking is useful for catching cycles that may
result from certain btree metadata corruption. But in general it serves
as a developer tool to catch bugs in code. Unless you're finding that
DM thin provisioning is hanging due to infinite loops within the block
manager's access to btree nodes you can safely disable this feature.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de> # do/while(0) macro fix
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 3f068040 04-Mar-2016 Mike Snitzer <snitzer@redhat.com>

dm: add missing newline between DM_DEBUG_BLOCK_STACK_TRACING and DM_BUFIO

Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 9ed84698 10-Feb-2016 Joe Thornber <ejt@redhat.com>

dm cache: make the 'mq' policy an alias for 'smq'

smq seems to be performing better than the old mq policy in all
situations, as well as using a quarter of the memory.

Make 'mq' an alias for 'smq' when choosing a cache policy. The tunables
that were present for the old mq are faked, and have no effect. mq
should be considered deprecated now.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# a739ff3f 03-Dec-2015 Sami Tolvanen <samitolvanen@google.com>

dm verity: add support for forward error correction

Add support for correcting corrupted blocks using Reed-Solomon.

This code uses RS(255, N) interleaved across data and hash
blocks. Each error-correcting block covers N bytes evenly
distributed across the combined total data, so that each byte is a
maximum distance away from the others. This makes it possible to
recover from several consecutive corrupted blocks with relatively
small space overhead.

In addition, using verity hashes to locate erasures nearly doubles
the effectiveness of error correction. Being able to detect
corrupted blocks also improves performance, because only corrupted
blocks need to corrected.

For a 2 GiB partition, RS(255, 253) (two parity bytes for each
253-byte block) can correct up to 16 MiB of consecutive corrupted
blocks if erasures can be located, and 8 MiB if they cannot, with
16 MiB space overhead.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 86bad0c7 23-Nov-2015 Mikulas Patocka <mpatocka@redhat.com>

dm bufio: store stacktrace in buffers to help find buffer leaks

The option DM_DEBUG_BLOCK_STACK_TRACING is moved from persistent-data
directory to device mapper directory because it will now be used by
persistent-data and bufio. When the option is enabled, each bufio buffer
stores the stacktrace of the last dm_bufio_get(), dm_bufio_read() or
dm_bufio_new() call that increased the hold count to 1. The buffer's
stacktrace is printed if the buffer was not released before the bufio
client is destroyed.

When DM_DEBUG_BLOCK_STACK_TRACING is enabled, any bufio buffer leaks are
considered warnings - i.e. the kernel continues afterwards. If not
enabled, buffer leaks are considered BUGs and the kernel with crash.
Reasoning on this disposition is: if we only ever warned on buffer leaks
users would generally ignore them and the problematic code would never
get fixed.

Successfully used to find source of bufio leaks fixed with commit
fce079f63c3 ("dm btree: fix bufio buffer leaks in dm_btree_del() error
path").

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 14f09e2f 03-Nov-2015 Arnd Bergmann <arnd@arndb.de>

raid5-cache: add crc32c Kconfig dependency

The recent change of the raid5-cache code to use crc32c instead
of crc32 causes link errors when CONFIG_LIBCRC32C is disabled:

drivers/built-in.o: In function crc32c'
core.c:(.text+0x1c6060): undefined reference to `crc32c'

This adds an explicit 'select' statement like all other users
of this function do.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 5cb2fbd6ea0d ("raid5-cache: use crc32c checksum")
Signed-off-by: NeilBrown <neilb@suse.com>


# 294ab783 09-Sep-2015 Christoph Hellwig <hch@lst.de>

scsi_dh: fix randconfig build error

It looks like the Kconfig check that was meant to fix this (commit
fe9233fb6914a0eb20166c967e3020f7f0fba2c9 [SCSI] scsi_dh: fix kconfig related
build errors) was actually reversed, but no-one noticed until the new set of
patches which separated DM and SCSI_DH).

Fixes: fe9233fb6914a0eb20166c967e3020f7f0fba2c9
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>


# 57d42487 06-Jul-2015 Geert Uytterhoeven <geert@linux-m68k.org>

dm: Spelling s/consitent/consistent/

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>


# 6ed443c0 05-Jul-2015 Baruch Siach <baruch@tkos.co.il>

dm crypt: update wiki page URL

Cryptsetup moved to gitlab. This is a leftover from commit e44f23b32dc7
(dm crypt: update URLs to new cryptsetup project page, 2015-04-05).

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 66a63635 15-May-2015 Joe Thornber <ejt@redhat.com>

dm cache: add stochastic-multi-queue (smq) policy

The stochastic-multi-queue (smq) policy addresses some of the problems
with the current multiqueue (mq) policy.

Memory usage
------------

The mq policy uses a lot of memory; 88 bytes per cache block on a 64
bit machine.

SMQ uses 28bit indexes to implement it's data structures rather than
pointers. It avoids storing an explicit hit count for each block. It
has a 'hotspot' queue rather than a pre cache which uses a quarter of
the entries (each hotspot block covers a larger area than a single
cache block).

All these mean smq uses ~25bytes per cache block. Still a lot of
memory, but a substantial improvement nontheless.

Level balancing
---------------

MQ places entries in different levels of the multiqueue structures
based on their hit count (~ln(hit count)). This means the bottom
levels generally have the most entries, and the top ones have very
few. Having unbalanced levels like this reduces the efficacy of the
multiqueue.

SMQ does not maintain a hit count, instead it swaps hit entries with
the least recently used entry from the level above. The over all
ordering being a side effect of this stochastic process. With this
scheme we can decide how many entries occupy each multiqueue level,
resulting in better promotion/demotion decisions.

Adaptability
------------

The MQ policy maintains a hit count for each cache block. For a
different block to get promoted to the cache it's hit count has to
exceed the lowest currently in the cache. This means it can take a
long time for the cache to adapt between varying IO patterns.
Periodically degrading the hit counts could help with this, but I
haven't found a nice general solution.

SMQ doesn't maintain hit counts, so a lot of this problem just goes
away. In addition it tracks performance of the hotspot queue, which
is used to decide which blocks to promote. If the hotspot queue is
performing badly then it starts moving entries more quickly between
levels. This lets it adapt to new IO patterns very quickly.

Performance
-----------

In my tests SMQ shows substantially better performance than MQ. Once
this matures a bit more I'm sure it'll become the default policy.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 0e9cebe7 20-Mar-2015 Josef Bacik <jbacik@fb.com>

dm: add log writes target

Introduce a new target that is meant for file system developers to test file
system integrity at particular points in the life of a file system. We capture
all write requests and associated data and log them to a separate device
for later replay. There is a userspace utility to do this replay. The
idea behind this is to give file system developers a tool to verify that
the file system is always consistent.

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Zach Brown <zab@zabbo.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 17e149b8 11-Mar-2015 Mike Snitzer <snitzer@redhat.com>

dm: add 'use_blk_mq' module param and expose in per-device ro sysfs attr

Request-based DM's blk-mq support defaults to off; but a user can easily
change the default using the dm_mod.use_blk_mq module/boot option.

Also, you can check what mode a given request-based DM device is using
with: cat /sys/block/dm-X/dm/use_blk_mq

This change enabled further cleanup and reduced work (e.g. the
md->io_pool and md->rq_pool isn't created if using blk-mq).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 8e854e9c 07-Mar-2014 Goldwyn Rodrigues <rgoldwyn@suse.com>

Create a separate module for clustering support

Tagged as EXPERIMENTAL for now.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>


# cf352487 15-Dec-2014 Loic Pefferkorn <loic@loicp.eu>

dm crypt: update url in CONFIG_DM_CRYPT help text

Update the obsolete url in the CONFIG_DM_CRYPT help text.

Signed-off-by: Loic Pefferkorn <loic@loicp.eu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 6341e62b 20-Dec-2014 Christoph Jaeger <cj@linux.com>

kconfig: use bool instead of boolean for type definition attributes

Support for keyword 'boolean' will be dropped later on.

No functional change.

Reference: http://lkml.kernel.org/r/cover.1418003065.git.cj@linux.com
Signed-off-by: Christoph Jaeger <cj@linux.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>


# 83fe27ea 05-Dec-2014 Pranith Kumar <bobby.prani@gmail.com>

rcu: Make SRCU optional by using CONFIG_SRCU

SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

text data bss dec hex filename
2007 0 0 2007 7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

text data bss dec hex filename
831552 64180 23944 919676 e087c arch/powerpc/boot/zImage : before
829504 64180 23952 917636 e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]


# eec40579 03-Mar-2014 Joe Thornber <ejt@redhat.com>

dm: add era target

dm-era is a target that behaves similar to the linear target. In
addition it keeps track of which blocks were written within a user
defined period of time called an 'era'. Each era target instance
maintains the current era as a monotonically increasing 32-bit
counter.

Use cases include tracking changed blocks for backup software, and
partially invalidating the contents of a cache to restore cache
coherency after rolling back a vendor snapshot.

dm-era is primarily expected to be paired with the dm-cache target.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# c64d240d 03-Mar-2014 Mike Snitzer <snitzer@redhat.com>

dm: fix Kconfig indentation

Since DM_DEBUG_BLOCK_STACK_TRACING is a DM_PERSISTENT_DATA config option
move it from drivers/md/Kconfig to drivers/md/persistent-data/Kconfig.

Doing so fixes indentation for other DM config options.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 2995fa78 13-Jan-2014 Mikulas Patocka <mpatocka@redhat.com>

dm sysfs: fix a module unload race

This reverts commit be35f48610 ("dm: wait until embedded kobject is
released before destroying a device") and provides an improved fix.

The kobject release code that calls the completion must be placed in a
non-module file, otherwise there is a module unload race (if the process
calling dm_kobject_release is preempted and the DM module unloaded after
the completion is triggered, but before dm_kobject_release returns).

To fix this race, this patch moves the completion code to dm-builtin.c
which is always compiled directly into the kernel if BLK_DEV_DM is
selected.

The patch introduces a new dm_kobject_holder structure, its purpose is
to keep the completion and kobject in one place, so that it can be
accessed from non-module code without the need to export the layout of
struct mapped_device to that code.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org


# 55494bf2 13-Jan-2014 Mikulas Patocka <mpatocka@redhat.com>

dm snapshot: use dm-bufio

Use dm-bufio for initial loading of the exceptions.
Introduce a new function dm_bufio_forget that frees the given buffer.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 10343180 13-Dec-2013 Mike Snitzer <snitzer@redhat.com>

dm persistent data: cleanup dm-thin specific references in text

DM's persistent-data library is now used my multiple targets so
exclusive references to "pool" or "thin provisioning" need to be
cleaned up. Adjust Kconfig's DM_DEBUG_BLOCK_STACK_TRACING text
and remove "pool" from a block manager error message.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>


# 5442851e 08-Nov-2013 Mikulas Patocka <mpatocka@redhat.com>

dm: fix Kconfig menu indentation

The option DM_LOG_USERSPACE is sub-option of DM_MIRROR, so place it
right after DM_MIRROR. Doing so fixes various other Device mapper
targets/features to be properly nested under "Device mapper support".

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>


# 9d0eb0ab 10-Jul-2013 Jim Ramsay <jim_ramsay@dell.com>

dm: add switch target

dm-switch is a new target that maps IO to underlying block devices
efficiently when there is a large number of fixed-sized address regions
but there is no simple pattern to allow for a compact mapping
representation such as dm-stripe.

Though we have developed this target for a specific storage device, Dell
EqualLogic, we have made an effort to keep it as general purpose as
possible in the hope that others may benefit.

Originally developed by Jim Ramsay. Simplified by Mikulas Patocka.

Signed-off-by: Jim Ramsay <jim_ramsay@dell.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# cafe5635 23-Mar-2013 Kent Overstreet <koverstreet@google.com>

bcache: A block layer cache

Does writethrough and writeback caching, handles unclean shutdown, and
has a bunch of other nifty features motivated by real world usage.

See the wiki at http://bcache.evilpiepirate.org for more.

Signed-off-by: Kent Overstreet <koverstreet@google.com>


# 8735a813 01-Mar-2013 Heinz Mauelshagen <mauelshagen@redhat.com>

dm cache: add cleaner policy

A simple cache policy that writes back all data to the origin.

This is used to decommission a dm cache by emptying it.

Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# f2836352 01-Mar-2013 Joe Thornber <ejt@redhat.com>

dm cache: add mq policy

A cache policy that uses a multiqueue ordered by recent hit
count to select which blocks should be promoted and demoted.
This is meant to be a general purpose policy. It prioritises
reads over writes.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# c6b4fcba 01-Mar-2013 Joe Thornber <ejt@redhat.com>

dm: add cache target

Add a target that allows a fast device such as an SSD to be used as a
cache for a slower device such as a disk.

A plug-in architecture was chosen so that the decisions about which data
to migrate and when are delegated to interchangeable tunable policy
modules. The first general purpose module we have developed, called
"mq" (multiqueue), follows in the next patch. Other modules are
under development.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# d57916a0 01-Mar-2013 Alasdair G Kergon <agk@redhat.com>

dm: remove CONFIG_EXPERIMENTAL

Remove EXPERIMENTAL from all existing device-mapper targets.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 51acbcec 27-Feb-2013 NeilBrown <neilb@suse.de>

md: remove CONFIG_MULTICORE_RAID456

This doesn't seem to actually help and we have an alternate
multi-threading approach waiting in the wings, so just get
rid of this config option and associated code.

As a bonus, we remove one use of CONFIG_EXPERIMENTAL

Cc: Dan Williams <djbw@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: NeilBrown <neilb@suse.de>


# 4f81a417 12-Oct-2012 Mike Snitzer <snitzer@redhat.com>

dm thin: move bio_prison code to separate module

The bio prison code will be useful to other future DM targets so
move it to a separate module.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# d9f691c3 01-Aug-2012 NeilBrown <neilb@suse.de>

md/dm-raid: DM_RAID should select MD_RAID10

Now that DM_RAID supports raid10, it needs to select that code
to ensure it is included.

Cc: Jonathan Brassow <jbrassow@redhat.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>


# 3caf6d73 27-Jul-2012 Joe Thornber <ejt@redhat.com>

dm persistent data: remove debug space map checker

Remove debug space map checker from dm persistent data.

The space map checker is a wrapper for other space maps that double
checks the reference counts are correct. It holds all these reference
counts in memory rather than on disk, so uses a lot of memory and is
thus restricted to small pools.

As yet, this checker hasn't found any issues, but has caused a few of
its own due to people turning it on by default with larger pools.

Removing.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# a4ffc152 28-Mar-2012 Mikulas Patocka <mpatocka@redhat.com>

dm: add verity target

This device-mapper target creates a read-only device that transparently
validates the data on one underlying device against a pre-generated tree
of cryptographic checksums stored on a second device.

Two checksum device formats are supported: version 0 which is already
shipping in Chromium OS and version 1 which incorporates some
improvements.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Elly Jones <ellyjones@chromium.org>
Cc: Milan Broz <mbroz@redhat.com>
Cc: Olof Johansson <olofj@chromium.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 035220b3 28-Mar-2012 Alasdair G Kergon <agk@redhat.com>

dm raid: no longer experimental

The dm raid module (using md) is becoming the preferred way of creating long-lived
mirrors through userspace LVM so remove the EXPERIMENTAL tag.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# e0b215da 28-Mar-2012 Alasdair G Kergon <agk@redhat.com>

dm uevent: no longer experimental

Drop EXPERIMENTAL tag from dm-uevent.

It's not changed for a while and some userspace tools are relying upon it.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 991d9fa0 31-Oct-2011 Joe Thornber <thornber@redhat.com>

dm: add thin provisioning target

Initial EXPERIMENTAL implementation of device-mapper thin provisioning
with snapshot support. The 'thin' target is used to create instances of
the virtual devices that are hosted in the 'thin-pool' target. The
thin-pool target provides data sharing among devices. This sharing is
made possible using the persistent-data library in the previous patch.

The main highlight of this implementation, compared to the previous
implementation of snapshots, is that it allows many virtual devices to
be stored on the same data volume, simplifying administration and
allowing sharing of data between volumes (thus reducing disk usage).

Another big feature is support for arbitrary depth of recursive
snapshots (snapshots of snapshots of snapshots ...). The previous
implementation of snapshots did this by chaining together lookup tables,
and so performance was O(depth). This new implementation uses a single
data structure so we don't get this degradation with depth.

For further information and examples of how to use this, please read
Documentation/device-mapper/thin-provisioning.txt

Signed-off-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 95d402f0 31-Oct-2011 Mikulas Patocka <mpatocka@redhat.com>

dm: add bufio

The dm-bufio interface allows you to do cached I/O on devices,
holding recently-read blocks in memory and performing delayed writes.

We don't use buffer cache or page cache already present in the kernel, because:
* we need to handle block sizes larger than a page
* we can't allocate memory to perform reads or we'd have deadlocks

Currently, when a cache is required, we limit its size to a fraction of
available memory. Usage can be viewed and changed in
/sys/module/dm_bufio/parameters/ .

The first user is thin provisioning, but more dm users are planned.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# b12d437b 01-Aug-2011 Jonathan Brassow <jbrassow@redhat.com>

dm raid: support metadata devices

Add the ability to parse and use metadata devices to dm-raid. Although
not strictly required, without the metadata devices, many features of
RAID are unavailable. They are used to store a superblock and bitmap.

The role, or position in the array, of each device must be recorded in
its superblock. This is to help with fault handling, array reshaping,
and sanity checks. RAID 4/5/6 devices must be loaded in a specific order:
in this way, the 'array_position' field helps validate the correctness
of the mapping when it is loaded. It can be used during reshaping to
identify which devices are added/removed. Fault handling is impossible
without this field. For example, when a device fails it is recorded in
the superblock. If this is a RAID1 device and the offending device is
removed from the array, there must be a way during subsequent array
assembly to determine that the failed device was the one removed. This
is done by correlating the 'array_position' field and the bit-field
variable 'failed_devices'.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 3407ef52 24-Mar-2011 Josef Bacik <josef@redhat.com>

dm: add flakey target

This target is the same as the linear target except that it returns I/O
errors periodically. It's been found useful in simulating failing
devices for testing purposes.

I needed a dm target to do some failure testing on btrfs's raid code, and
Mike pointed me at this.

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 9d09e663 13-Jan-2011 NeilBrown <neilb@suse.de>

dm: raid456 basic support

This patch is the skeleton for the DM target that will be
the bridge from DM to MD (initially RAID456 and later RAID1). It
provides a way to use device-mapper interfaces to the MD RAID456
drivers.

As with all device-mapper targets, the nominal public interfaces are the
constructor (CTR) tables and the status outputs (both STATUSTYPE_INFO
and STATUSTYPE_TABLE). The CTR table looks like the following:

1: <s> <l> raid \
2: <raid_type> <#raid_params> <raid_params> \
3: <#raid_devs> <meta_dev1> <dev1> .. <meta_devN> <devN>

Line 1 contains the standard first three arguments to any device-mapper
target - the start, length, and target type fields. The target type in
this case is "raid".

Line 2 contains the arguments that define the particular raid
type/personality/level, the required arguments for that raid type, and
any optional arguments. Possible raid types include: raid4, raid5_la,
raid5_ls, raid5_rs, raid6_zr, raid6_nr, and raid6_nc. (again, raid1 is
planned for the future.) The list of required and optional parameters
is the same for all the current raid types. The required parameters are
positional, while the optional parameters are given as key/value pairs.
The possible parameters are as follows:
<chunk_size> Chunk size in sectors.
[[no]sync] Force/Prevent RAID initialization
[rebuild <idx>] Rebuild the drive indicated by the index
[daemon_sleep <ms>] Time between bitmap daemon work to clear bits
[min_recovery_rate <kB/sec/disk>] Throttle RAID initialization
[max_recovery_rate <kB/sec/disk>] Throttle RAID initialization
[max_write_behind <value>] See '-write-behind=' (man mdadm)
[stripe_cache <sectors>] Stripe cache size for higher RAIDs

Line 3 contains the list of devices that compose the array in
metadata/data device pairs. If the metadata is stored separately, a '-'
is given for the metadata device position. If a drive has failed or is
missing at creation time, a '-' can be given for both the metadata and
data drives for a given position.

Examples:
# RAID4 - 4 data drives, 1 parity
# No metadata devices specified to hold superblock/bitmap info
# Chunk size of 1MiB
# (Lines separated for easy reading)
0 1960893648 raid \
raid4 1 2048 \
5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

# RAID4 - 4 data drives, 1 parity (no metadata devices)
# Chunk size of 1MiB, force RAID initialization,
# min recovery rate at 20 kiB/sec/disk
0 1960893648 raid \
raid4 4 2048 min_recovery_rate 20 sync\
5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81

Performing a 'dmsetup table' should display the CTR table used to
construct the mapping (with possible reordering of optional
parameters).

Performing a 'dmsetup status' will yield information on the state and
health of the array. The output is as follows:
1: <s> <l> raid \
2: <raid_type> <#devices> <1 health char for each dev> <resync_ratio>

Line 1 is standard DM output. Line 2 is best shown by example:
0 1960893648 raid raid4 5 AAAAA 2/490221568
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with recovery.

Cc: linux-raid@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 08fb730c 02-May-2010 NeilBrown <neilb@suse.de>

md: remove EXPERIMENTAL designation from RAID10

RAID10 has been available for quite a while now and is quite well
tested, so we can remove the EXPERIMENTAL designation.

Reported-by: Eric MSP Veith <eveith@wwweb-library.net>
Signed-off-by: NeilBrown <neilb@suse.de>


# 93bd89a6 13-Dec-2009 NeilBrown <neilb@suse.de>

md: revise Kconfig help for MD_MULTIPATH

Make it clear in the config message that MD_MULTIPATH is not under
active development.

Cc: Oren Held <orenhe@il.ibm.com>
Signed-off-by: NeilBrown <neilb@suse.de>


# e5d84970 29-Oct-2009 David Woodhouse <David.Woodhouse@intel.com>

async_tx: Move ASYNC_RAID6_TEST option to crypto/async_tx/, fix dependencies

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>


# f5e70d0f 13-Jul-2009 David Woodhouse <dwmw2@tylersburg.infradead.org>

md: Factor out RAID6 algorithms into lib/

We'll want to use these in btrfs too.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>


# 07a3b417 29-Aug-2009 Dan Williams <dan.j.williams@intel.com>

md/raid456: distribute raid processing over multiple cores

Now that the resources to handle stripe_head operations are allocated
percpu it is possible for raid5d to distribute stripe handling over
multiple cores. This conversion also adds a call to cond_resched() in
the non-multicore case to prevent one core from getting monopolized for
raid operations.

Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# ac6b53b6 14-Jul-2009 Dan Williams <dan.j.williams@intel.com>

md/raid6: asynchronous raid6 operations

[ Based on an original patch by Yuri Tikhonov ]

The raid_run_ops routine uses the asynchronous offload api and
the stripe_operations member of a stripe_head to carry out xor+pq+copy
operations asynchronously, outside the lock.

The operations performed by RAID-6 are the same as in the RAID-5 case
except for no support of STRIPE_OP_PREXOR operations. All the others
are supported:
STRIPE_OP_BIOFILL
- copy data into request buffers to satisfy a read request
STRIPE_OP_COMPUTE_BLK
- generate missing blocks (1 or 2) in the cache from the other blocks
STRIPE_OP_BIODRAIN
- copy data out of request buffers to satisfy a write request
STRIPE_OP_RECONSTRUCT
- recalculate parity for new data that has entered the cache
STRIPE_OP_CHECK
- verify that the parity is correct

The flow is the same as in the RAID-5 case, and reuses some routines, namely:
1/ ops_complete_postxor (renamed to ops_complete_reconstruct)
2/ ops_complete_compute (updated to set up to 2 targets uptodate)
3/ ops_run_check (renamed to ops_run_check_p for xor parity checks)

[neilb@suse.de: fixes to get it to pass mdadm regression suite]
Reviewed-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# cb3c8299 14-Jul-2009 Dan Williams <dan.j.williams@intel.com>

async_tx: raid6 recovery self test

Port drivers/md/raid6test/test.c to use the async raid6 recovery
routines. This is meant as a unit test for raid6 acceleration drivers. In
addition to the 16-drive test case this implements tests for the 4-disk and
5-disk special cases (dma devices can not generically handle less than 2
sources), and adds a test for the D+Q case.

Reviewed-by: Andre Noll <maan@systemlinux.org>
Acked-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# f5db4af4 22-Jun-2009 Jonthan Brassow <jbrassow@redhat.com>

dm raid1: add userspace log

This patch contains a device-mapper mirror log module that forwards
requests to userspace for processing.

The structures used for communication between kernel and userspace are
located in include/linux/dm-log-userspace.h. Due to the frequency,
diversity, and 2-way communication nature of the exchanges between
kernel and userspace, 'connector' was chosen as the interface for
communication.

The first log implementations written in userspace - "clustered-disk"
and "clustered-core" - support clustered shared storage. A userspace
daemon (in the LVM2 source code repository) uses openAIS/corosync to
process requests in an ordered fashion with the rest of the nodes in the
cluster so as to prevent log state corruption. Other implementations
with no association to LVM or openAIS/corosync, are certainly possible.

(Imagine if two machines are writing to the same region of a mirror.
They would both mark the region dirty, but you need a cluster-aware
entity that can handle properly marking the region clean when they are
done. Otherwise, you might clear the region when the first machine is
done, not the second.)

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# f392ba88 22-Jun-2009 Kiyoshi Ueda <k-ueda@ct.jp.nec.com>

dm mpath: add service time load balancer

This patch adds a service time oriented dynamic load balancer,
dm-service-time, which selects the path with the shortest estimated
service time for the incoming I/O.
The service time is estimated by dividing the in-flight I/O size
by a performance value of each path.

The performance value can be given as a table argument at the table
loading time. If no performance value is given, all paths are
considered equal.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# fd5e0339 22-Jun-2009 Kiyoshi Ueda <k-ueda@ct.jp.nec.com>

dm mpath: add queue length load balancer

This patch adds a dynamic load balancer, dm-queue-length, which
balances the number of in-flight I/Os across the paths.

The code is based on the patch posted by Stefan Bader:
https://www.redhat.com/archives/dm-devel/2005-October/msg00050.html

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 2cffc4a0 30-Mar-2009 NeilBrown <neilb@suse.de>

md: remove CONFIG_MD_RAID_RESHAPE config option.

This was only needed when the code was experimental. Most of it
is well tested now, so the option is no longer useful.

Signed-off-by: NeilBrown <neilb@suse.de>


# f701d589 30-Mar-2009 Dan Williams <dan.j.williams@intel.com>

md/raid6: move raid6 data processing to raid6_pq.ko

Move the raid6 data processing routines into a standalone module
(raid6_pq) to prepare them to be called from async_tx wrappers and other
non-md drivers/modules. This precludes a circular dependency of raid456
needing the async modules for data processing while those modules in
turn depend on raid456 for the base level synchronous raid6 routines.

To support this move:
1/ The exportable definitions in raid6.h move to include/linux/raid/pq.h
2/ The raid6_call, recovery calls, and table symbols are exported
3/ Extra #ifdef __KERNEL__ statements to enable the userspace raid6test to
compile

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>


# ce52aebd 10-Oct-2008 Alan Jenkins <alan-jenkins@tuffmail.co.uk>

raid, fastboot: hide RAID autodetect option if MD is compiled as a module

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


# a364092a 21-Sep-2008 Arjan van de Ven <arjan@linux.intel.com>

raid: make RAID autodetect default a KConfig option

RAID autodetect has the side effect of requiring synchronisation
of all device drivers, which can make the boot several seconds longer
(I've measured 7 on one of my laptops).... even for systems that don't
have RAID setup for the root filesystem (the only FS where this matters).

This patch makes the default for autodetect a config option; either way
the user can always override via the kernel command line.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: NeilBrown <neilb@suse.de>


# fe9233fb 23-May-2008 Chandra Seetharaman <sekharan@us.ibm.com>

[SCSI] scsi_dh: fix kconfig related build errors

Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist,just do not allow hardware handlers to be used.

Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.

[jejb: added comment for Kconfig syntax]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>


# cb520223 01-May-2008 Chandra Seetharaman <sekharan@us.ibm.com>

[SCSI] scsi_dh: Remove hardware handlers from dm

This patch removes the 3 hardware handlers that currently exist
under dm as the functionality is moved to SCSI layer in the earlier
patches.

[jejb: removed more makefile hunks and rejection fixes]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>


# cfae5c9b 01-May-2008 Chandra Seetharaman <sekharan@us.ibm.com>

[SCSI] scsi_dh: Use SCSI device handler in dm-multipath

This patch converts dm-mpath to use scsi device handlers instead of
dm's hardware handlers.

This patch does not add any new functionality. Old behaviors remain and
userspace tools work as is except that arguments supplied with hardware
handler are ignored.

One behavioral exception is: Activation of a path is synchronous in this
patch, opposed to the older behavior of being asynchronous (changed in
patch 07: scsi_dh: Add a single threaded workqueue for initializing a path)

Note: There is no need to get a reference for the device handler module
(as it was done in the dm hardware handler case) here as the reference
is held when the device was first found. Instead we check and make sure
that support for the specified device is present at table load time.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>


# 0149e57f 07-Feb-2008 Alasdair G Kergon <agk@redhat.com>

dm: targets no longer experimental

Drop the EXPERIMENTAL tag from well-established device-mapper targets, so
the newer ones stand out better.

Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# d1622e89 13-Dec-2007 Paul Mundt <lethal@linux-sh.org>

dm mpath: hp requires scsi

With CONFIG_SCSI=n __scsi_print_sense() is never linked in.

drivers/built-in.o: In function `hp_sw_end_io':
dm-mpath-hp-sw.c:(.text+0x914f8): undefined reference to `__scsi_print_sense'

Caught with a randconfig on current git.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 51e5b2bd 19-Oct-2007 Mike Anderson <andmike@linux.vnet.ibm.com>

dm: add uevent to core

This patch adds a uevent skeleton to device-mapper.

Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 16ebbf35 19-Oct-2007 Dave Wysochanski <dwysocha@redhat.com>

dm mpath: add hp handler

This patch adds the most basic dm-multipath hardware support for the
HP active/passive arrays.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>


# 6e106b0d 24-Aug-2007 Randy Dunlap <randy.dunlap@oracle.com>

DM_MULTIPATH_RDAC: "scsi_normalize_sense" undefined

DM_MULTIPATH_RDAC uses SCSI API(s) and is for a SCSI device,
so add SCSI to its depends on to prevent build errors.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
[ Tested and Verified by Chandra Seetharaman ]
Acked-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# afd44034 17-Jul-2007 Jan Engelhardt <jengelh@linux01.gwdg.de>

Use menuconfig objects II - MD

Change Kconfig objects from "menu, config" into "menuconfig" so
that the user can disable the whole feature without having to
enter the menu first.

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9bc89cd8 02-Jan-2007 Dan Williams <dan.j.williams@intel.com>

async_tx: add the async_tx api

The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies. It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations. Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources.

I imagine that any piece of ADMA hardware would register with the
'async_*' subsystem, and a call to async_X would be routed as
appropriate, or be run in-line. - Neil Brown

async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:

struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
dma_async_tx_callback cb_fn, void *cb_param)
{
struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
struct dma_device *device = chan ? chan->device : NULL;
int int_en = cb_fn ? 1 : 0;
struct dma_async_tx_descriptor *tx = device ?
device->device_prep_dma_<operation>(chan, len, int_en) : NULL;

if (tx) { /* run <operation> asynchronously */
...
tx->tx_set_dest(addr, tx, index);
...
tx->tx_set_src(addr, tx, index);
...
async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
} else { /* run <operation> synchronously */
...
<operation>
...
async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
}

return tx;
}

async_tx_find_channel() returns a capable channel from its pool. The
channel pool is organized as a per-cpu array of channel pointers. The
async_tx_rebalance() routine is tasked with managing these arrays. In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities. For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor. In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel. A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.

Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api. A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit. With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.

On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement). For the other cases performance was roughly equal, +/- a few
percentage points. On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation. According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.

The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available

Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
async_tx_find_channel a static inline routine that always returns NULL
* when a callback is specified for a given transaction an interrupt will
fire at operation completion time and the callback will occur in a
tasklet. if the the channel does not support interrupts then a live
polling wait will be performed
* the api is written as a dmaengine client that requests all available
channels
* In support of dependencies the api implicitly schedules channel-switch
interrupts. The interrupt triggers the cleanup tasklet which causes
pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
xor routine. To the software routine the destination address is an implied
source, whereas engines treat it as a write-only destination. This patch
modifies the xor_blocks routine to take a an explicit destination address
to mirror the hardware.

Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts <= 1
* implicitly handle hardware concerns like channel switching and
interrupts, Neil Brown
* remove the per operation type list, and distribute operation capabilities
evenly amongst the available channels
* simplify async_tx_find_channel to optimize the fast path
* introduce the channel_table_initialized flag to prevent early calls to
the api
* reorganize the code to mimic crypto
* include mm.h as not all archs include it in dma-mapping.h
* make the Kconfig options non-user visible, Adrian Bunk
* move async_tx under crypto since it is meant as 'core' functionality, and
the two may share algorithms in the future
* move large inline functions into c files
* checkpatch.pl fixes
* gpl v2 only correction

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>


# 685784aa 09-Jul-2007 Dan Williams <dan.j.williams@intel.com>

xor: make 'xor_blocks' a library routine for use with async_tx

The async_tx api tries to use a dma engine for an operation, but will fall
back to an optimized software routine otherwise. Xor support is
implemented using the raid5 xor routines. For organizational purposes this
routine is moved to a common area.

The following fixes are also made:
* rename xor_block => xor_blocks, suggested by Adrian Bunk
* ensure that xor.o initializes before md.o in the built-in case
* checkpatch.pl fixes
* mark calibrate_xor_blocks __init, Adrian Bunk

Cc: Adrian Bunk <bunk@stusta.de>
Cc: NeilBrown <neilb@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>


# dd172d72 12-Jul-2007 Chandra Seetharaman <sekharan@us.ibm.com>

dm mpath: rdac

This patch supports LSI/Engenio devices in RDAC mode. Like dm-emc
it requires userspace support. In your multipath.conf file you must have:

path_checker rdac
hardware_handler "1 rdac"
prio_callout "/sbin/mpath_prio_tpc /dev/%n"

And you also then must have a updated multipath tools release which
has rdac support.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 26b9f228 09-May-2007 Heinz Mauelshagen <mauelshagen@redhat.com>

dm: delay target

New device-mapper target that can delay I/O (for testing). Reads can be
separated from writes, redirected to different underlying devices and delayed
by differing amounts of time.

Signed-off-by: Heinz Mauelshagen <mauelshagen@redhat.com>
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 3263263f 09-Dec-2006 Herbert Xu <herbert@gondor.apana.org.au>

[CRYPTO] dm-crypt: Select CRYPTO_CBC

As CBC is the default chaining method for cryptoloop, we should select
it from cryptoloop to ease the transition. Spotted by Rene Herman.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 14f50b49 03-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] md: remove 'experimental' classification from raid5 reshape

I have had enough success reports not to believe that this is safe for 2.6.19.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# cc109201 03-Oct-2006 Bryn Reeves <breeves@redhat.com>

[PATCH] dm: add debug macro

Add CONFIG_DM_DEBUG and DMDEBUG() macro.

Signed-off-by: Bryn Reeves <breeves@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 9361401e 30-Sep-2006 David Howells <dhowells@redhat.com>

[PATCH] BLOCK: Make it possible to disable the block layer [try #6]

Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.

(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:

(*) Block I/O tracing.

(*) Disk partition code.

(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.

(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.

(*) MTD blockdev handling and FTL.

(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.

(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.

(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.

(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.

(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:

(*) Default blockdev file operations (to give error ENODEV on opening).

(*) Makes some /proc changes:

(*) /proc/devices does not list any blockdevs.

(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.

(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).

(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>


# b3cc9ec7 26-Jun-2006 akpm@osdl.org <akpm@osdl.org>

[PATCH] md: Fix Kconfig error

RAID5 recently changed to RAID456

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 4d2554d0 26-Jun-2006 Justin Piszcz <jpiszcz@lucidpixels.com>

[PATCH] md: md Kconfig speeling feex

I was experimenting with Linux SW raid today and found a spelling error when
reading the help menus... (and fly spell found more).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 16a53ecc 26-Jun-2006 NeilBrown <neilb@suse.de>

[PATCH] md: merge raid5 and raid6 code

There is a lot of commonality between raid5.c and raid6main.c. This patches
merges both into one module called raid456. This saves a lot of code, and
paves the way for online raid5->raid6 migrations.

There is still duplication, e.g. between handle_stripe5 and handle_stripe6.
This will probably be cleaned up later.

Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 6f91fe88 10-Apr-2006 NeilBrown <neilb@suse.de>

[PATCH] md: make sure 64bit fields in version-1 metadata are 64-bit aligned

reshape_position is a 64bit field that was not 64bit aligned. So swap with
new_level.

NOTE: this is a user-visible change. However:
- The bad code has not appeared in a released kernel
- This code is still marked 'experimental'
- This only affects version-1 superblock, which are not in wide use
- These field are only used (rather than simply reported) by user-space
tools in extemely rare circumstances : after a reshape crashes in the
first second of the reshape process.

So I believe that, at this stage, the change is safe. Especially if people
heed the 'help' message on use mdadm-2.4.1.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 29269553 27-Mar-2006 NeilBrown <neilb@suse.de>

[PATCH] md: Final stages of raid5 expand code

This patch adds raid5_reshape and end_reshape which will start and finish the
reshape processes.

raid5_reshape is only enabled in CONFIG_MD_RAID5_RESHAPE is set, to discourage
accidental use.

Read the 'help' for the CONFIG_MD_RAID5_RESHAPE entry.

and Make sure that you have backups, just in case.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!