History log of /freebsd-current/sys/geom/multipath/g_multipath.c
Revision Date Author Comments
# fdafd315 24-Nov-2023 Warner Losh <imp@FreeBSD.org>

sys: Automated cleanup of cdefs and other formatting

Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by: Netflix


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 10ae42cc 29-Jan-2022 Alexander Motin <mav@FreeBSD.org>

GEOM: Set G_CF_DIRECT_SEND/RECEIVE for taste consumers.

All I/O requests through the taste consumers are synchronous, done
with g_read_data() and without any locks held. It makes no sense
to delegate the I/O to g_down/g_up threads.

This removes many context switches during disk retaste.

MFC after: 2 weeks


# b74fdaaf 25-Nov-2021 Mateusz Guzik <mjg@FreeBSD.org>

geom_multipath: plug set-but-not-used vars

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 420dbe76 22-Apr-2021 Alan Somers <asomers@FreeBSD.org>

gmultipath: make physpath distinct from the underlying providers'

zfsd uses a device's physical path attribute to automatically replace a
missing ZFS disk when a blank disk is inserted into the same physical
slot. Currently gmultipath passes through its underlying providers'
physical path attribute. That may cause zfsd to replace a missing
gmultipath provider with a newly arrived, single-path disk. That would
be bad.

This commit fixes that problem by simply appending "/mp" to the
underlying providers' physical path, in a manner similar to what geli
already does.

Sponsored by: Axcient
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29941
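
The suffixing described above can be illustrated with a small userland sketch. It is not the code from the commit: the helper name mp_physpath() and the sample path string are assumptions made for illustration, and the kernel code works on GEOM attribute requests rather than plain buffers.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from g_multipath.c): derive a multipath
 * physical path by appending "/mp" to the underlying provider's path.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int
mp_physpath(char *buf, size_t len, const char *underlying)
{
	int n;

	n = snprintf(buf, len, "%s/mp", underlying);
	if (n < 0 || (size_t)n >= len)
		return (-1);	/* output was truncated or failed */
	return (0);
}
```

With an underlying path of "enc@n1/slot@3" (a made-up value), the result is "enc@n1/slot@3/mp", which no longer compares equal to the path of a single-path disk in the same slot.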


# d22ff249 18-Oct-2020 Edward Tomasz Napierala <trasz@FreeBSD.org>

Make g_attach() return ENXIO for orphaned providers; update various
classes to add missing error checking.

Reviewed by: imp
MFC after: 2 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26658


# d40bc607 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

geom: clean up empty lines in .c and .h files


# 8510f61a 08-Jul-2020 Xin LI <delphij@FreeBSD.org>

sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/".

Reviewed by: cem
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D25565


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT.

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 67f72211 05-Dec-2019 Alan Somers <asomers@FreeBSD.org>

gmultipath: add ATF tests

Add ATF tests for most gmultipath operations. Add some dtrace probes too,
primarily for configuration changes that happen in response to provider
errors.

PR: 178473
MFC after: 2 weeks
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D22235


# 49ee0fce 19-Jun-2019 Alexander Motin <mav@FreeBSD.org>

Use sbuf_cat() in GEOM confxml generation.

When it comes to megabytes of text, the difference between sbuf_printf() and
sbuf_cat() becomes substantial.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.


# 74d6c131 10-Apr-2018 Kyle Evans <kevans@FreeBSD.org>

Annotate geom modules with MODULE_VERSION

GEOM ELI may ask for the password twice during boot: once at loader time
and once at init time.

This happens due to a module loading bug. By default GEOM ELI caches the
password in the kernel, but without the MODULE_VERSION annotation, the
kernel module is loaded anyway, even if GEOM ELI was compiled into
the kernel. In this case, the newly loaded module
purges/invalidates/overwrites GEOM ELI's password cache, which causes
the double prompt.

MFC Note: There's a pc98 component to the original submission that is
omitted here due to pc98 removal in head. This part will need to be revived
upon MFC.

Reviewed by: imp
Submitted by: op
Obtained from: opBSD
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14992


# 3728855a 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/geom: adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
supersede or replace the license texts.


# d3fef0a0 20-Jan-2017 Alexander Motin <mav@FreeBSD.org>

Report disk addition errors on `add` or `create` subcommand.

MFC after: 1 week


# 80f0a89c 12-Nov-2016 Alexander Motin <mav@FreeBSD.org>

Do not report error on close even if we have no paths left.

MFC after: 2 weeks


# 25080ac4 15-Dec-2015 Steven Hartland <smh@FreeBSD.org>

Prevent g_access calls to bad multipath members

When a multipath member is orphaned, its access members are zeroed before it
is removed if it is marked for wither, so prevent any future calls to
g_access on such members.

This prevents a panic on debug kernels, which validate that the resultant
values aren't negative.

Reviewed by: mav
MFC after: 2 weeks
Sponsored by: Multiplay
Differential Revision: https://reviews.freebsd.org/D4416


# 0ada3afc 09-Apr-2015 Alexander Motin <mav@FreeBSD.org>

Remove sleeps from geom_up thread on device destruction.

MFC after: 3 days.


# eaed60f7 19-Jan-2014 Alexander Motin <mav@FreeBSD.org>

Removed an unneeded and dangerous assignment. It would probably cause a NULL
reference panic if the compiler did not optimize it out.

Found with: Clang static analyzer
MFC after: 2 weeks


# f8c79813 16-Nov-2013 Alexander Motin <mav@FreeBSD.org>

In addition to r258220 allow shrinking in "automatic" mode if there is
already valid metadata found at the new location. This should allow easy
transparent recovery if first resize was done by mistake.

While there, unify metadata write code and fix minor memory leak.

MFC after: 1 month


# e6afd72b 16-Nov-2013 Alexander Motin <mav@FreeBSD.org>

Implement automatic live resize support for GEOM MULTIPATH class.

In "manual" mode just automatically resize provider in any direction.
In "automatic" mode allow only growth (with new metadata write); in case
of shrinking destroy the multipath device same as before since it may be
undesirable to write new metadata within old user area.

MFC after: 1 month
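
The resize policy in this commit message can be restated as a small decision function. This is a sketch only: the enum names and sk_on_resize() are hypothetical, and treating an unchanged size as a no-op is an assumption the message does not spell out.

```c
/* Hypothetical names; they do not appear in g_multipath.c. */
enum sk_cfg { SK_MANUAL, SK_AUTOMATIC };
enum sk_action {
	SK_NOTHING,		/* size unchanged (assumed no-op) */
	SK_RESIZE,		/* resize the provider, no metadata involved */
	SK_RESIZE_WRITE_META,	/* grow and write metadata at the new end */
	SK_DESTROY		/* destroy the multipath device */
};

static enum sk_action
sk_on_resize(enum sk_cfg cfg, long long oldsize, long long newsize)
{
	if (newsize == oldsize)
		return (SK_NOTHING);
	if (cfg == SK_MANUAL)
		return (SK_RESIZE);		/* any direction is fine */
	if (newsize > oldsize)
		return (SK_RESIZE_WRITE_META);	/* automatic: growth only */
	return (SK_DESTROY);			/* automatic: shrink */
}
```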


# 40ea77a0 22-Oct-2013 Alexander Motin <mav@FreeBSD.org>

Merge GEOM direct dispatch changes from the projects/camlock branch.

When safety requirements are met, it avoids passing I/O requests
to the GEOM g_up/g_down threads, executing them directly in the caller context.
That avoids CPU bottlenecks in the g_up/g_down threads and saves
several context switches per I/O.

The currently defined safety requirements are:
- caller should not hold any locks and should be reenterable;
- callee should not depend on GEOM dual-threaded concurrency semantics;
- on the way down, if request is unmapped while callee doesn't support it,
the context should be sleepable;
- kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting the above requirements,
new provider and consumer flags are added:
- G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
- G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
- G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
- G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
A capable GEOM class can set them, allowing direct dispatch in cases where
it is safe. If any of the requirements is not met, the request is queued to
the g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability, the disk(9) KPI got a new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. The da(4) and ada(4) disk
drivers now have it set thanks to earlier CAM locking work.

This change more than doubles peak block storage performance on
systems with many CPUs, together with earlier CAM locking changes reaching
more than 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by: iXsystems, Inc.
MFC after: 2 months
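
For the request (downward) direction, the requirements listed in this commit message can be read as a single predicate. The sketch below is a userland illustration under stated assumptions: the SK_* flag values, struct sk_request_ctx and sk_direct_down() are invented names, and the real checks in the GEOM I/O path are not reproduced here.

```c
#include <stdbool.h>

/* Illustrative stand-ins for G_CF_DIRECT_SEND and G_PF_DIRECT_RECEIVE. */
#define SK_CF_DIRECT_SEND	0x01u
#define SK_PF_DIRECT_RECEIVE	0x01u

/* Hypothetical snapshot of the state consulted for one request. */
struct sk_request_ctx {
	unsigned consumer_flags;	/* flags set by the sending class */
	unsigned provider_flags;	/* flags set by the receiving class */
	bool unmapped;			/* request carries unmapped data */
	bool callee_unmapped_ok;	/* callee supports unmapped requests */
	bool can_sleep;			/* current context is sleepable */
	unsigned stack_used_pct;	/* kernel stack usage, in percent */
};

/* Return true for direct dispatch, false to queue to the g_down thread. */
static bool
sk_direct_down(const struct sk_request_ctx *c)
{
	if ((c->consumer_flags & SK_CF_DIRECT_SEND) == 0)
		return (false);	/* caller requirements not declared */
	if ((c->provider_flags & SK_PF_DIRECT_RECEIVE) == 0)
		return (false);	/* callee requirements not declared */
	if (c->unmapped && !c->callee_unmapped_ok && !c->can_sleep)
		return (false);	/* unsupported unmapped request, cannot sleep */
	if (c->stack_used_pct >= 50)
		return (false);	/* stack usage must stay below 50% */
	return (true);
}
```

Any single failed check falls back to queueing, which matches the "same as before" behaviour the message describes.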


# f4673017 25-Mar-2013 Alexander Motin <mav@FreeBSD.org>

Make GEOM MULTIPATH report unmapped bio support if the underlying path reports
it. GEOM MULTIPATH itself never touches the data and so is transparent.


# 02c62349 19-Nov-2012 Jaakko Heinonen <jh@FreeBSD.org>

- Don't pass geom and provider names as format strings.
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().

Reviewed by: pjd
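
The first two items can be illustrated in userland. The function sk_setnamef() below is a hypothetical stand-in for a printf-style naming function; the attribute is the GCC/Clang spelling that FreeBSD's __printflike() macro wraps.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical printf-style name formatter. The format attribute lets
 * the compiler check the arguments against the format string.
 */
static int sk_setnamef(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

static int
sk_setnamef(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}
```

A name should be passed as an argument, as in sk_setnamef(buf, sizeof(buf), "%s", name), and never as the format itself: a name such as "da%d" would otherwise be interpreted as a conversion and read a nonexistent argument.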


# 8fb378d6 25-Aug-2012 Thomas Quinot <thomas@FreeBSD.org>

(g_multipath_rotate): Fix algorithm so that it does rotate over all good
providers, not just the last two.

PR: kern/170379
Reviewed by: mav
MFC after: 2 weeks
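
A rotation that visits every good provider can be sketched as a wrapping scan that starts just after the current active path. This is an illustration of the idea only; struct sk_path and sk_rotate() are hypothetical and the array representation is an assumption.

```c
#include <stdbool.h>

/* Hypothetical path record; not the structure used by g_multipath.c. */
struct sk_path {
	const char *name;
	bool failed;
};

/*
 * Return the index of the next good path after `active`, wrapping
 * around the array. `active` is assumed to be in [0, npaths).
 * Returns -1 when no good path exists.
 */
static int
sk_rotate(const struct sk_path *paths, int npaths, int active)
{
	for (int i = 1; i <= npaths; i++) {
		int cand = (active + i) % npaths;

		if (!paths[cand].failed)
			return (cand);
	}
	return (-1);
}
```

Repeated calls cycle through all good paths in order; if the active path is the only good one, the scan wraps back to it.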


# 71ee4ef0 03-Aug-2012 Thomas Quinot <thomas@FreeBSD.org>

New command "gmultipath prefer" to force selection of a specified
provider in an Active/Passive configuration.

Reviewed by: mav
MFC after: 4 weeks


# a839e332 05-Jun-2012 Alexander Motin <mav@FreeBSD.org>

Add missing newlines into XML output.

MFC after: 3 days
Sponsored by: iXsystems, Inc.


# c0b1ef66 05-May-2012 Alexander Motin <mav@FreeBSD.org>

Fix `gmultipath configure` for big-endian machines.

MFC after: 1 week


# 63297dfd 18-Apr-2012 Alexander Motin <mav@FreeBSD.org>

Some improvements to GEOM MULTIPATH:
- Implement "configure" command to allow switching the operation mode of
a running device on the fly without destroying and recreating it.
- Implement Active/Read mode as a hybrid of Active/Active and Active/Passive.
In this mode all paths not marked FAIL may handle reads at the same time,
but unlike Active/Active only one path handles write requests at any
point in time. It allows following the original write request order more
closely if above layers need it for data consistency (not waiting for requisite
write completion before sending dependent write).
- Hide duplicate messages about device status change.
- Remove periodic thread wake up with 10Hz rate.

MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
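
The difference between the three modes comes down to which requests may be spread over several paths. The sketch below assumes a simple round-robin spread, which the commit message does not specify; all names (sk_mode, sk_mp, sk_pick) are hypothetical.

```c
#include <stdbool.h>

#define SK_MAXPATHS 8

enum sk_mode { SK_ACTIVE_PASSIVE, SK_ACTIVE_ACTIVE, SK_ACTIVE_READ };
enum sk_op { SK_READ, SK_WRITE };

/* Hypothetical device state; not the layout used by g_multipath.c. */
struct sk_mp {
	enum sk_mode mode;
	int npaths;
	bool failed[SK_MAXPATHS];	/* true if the path is marked FAIL */
	int active;			/* single path for ordered traffic */
	int next;			/* round-robin cursor (assumption) */
};

/* Return the path index to use for this request, or -1 if none. */
static int
sk_pick(struct sk_mp *mp, enum sk_op op)
{
	bool spread = mp->mode == SK_ACTIVE_ACTIVE ||
	    (mp->mode == SK_ACTIVE_READ && op == SK_READ);

	if (!spread)	/* Active/Passive, or a write in Active/Read */
		return (mp->failed[mp->active] ? -1 : mp->active);

	for (int i = 0; i < mp->npaths; i++) {
		int cand = (mp->next + i) % mp->npaths;

		if (!mp->failed[cand]) {
			mp->next = (cand + 1) % mp->npaths;
			return (cand);
		}
	}
	return (-1);
}
```

In Active/Read mode, reads alternate over the paths not marked FAIL while every write goes to the single active path, so writes are never issued through different paths at the same time.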


# 0c883cef 12-Nov-2011 Alexander Motin <mav@FreeBSD.org>

Major GEOM MULTIPATH class rewrite:
- Improved locking and destruction process to fix crashes.
- Improved "automatic" configuration method to make it consistent and safe
by reading metadata back from all specified paths after writing to one.
- Added provider size check to reduce chance of ordering conflict with
other GEOM classes.
- Added "manual" configuration method without using on-disk metadata.
- Added "add" and "remove" commands to allow managing paths manually.
- Failed paths are no longer dropped from geom, but only marked as FAIL
and excluded from I/O operations.
- Automatically restore failed paths when all other paths are marked
as failed, for example, because of device-caused (not transport) errors.
- Added "fail" and "restore" commands to manually control FAIL flag.
- geom is now destroyed on last path disconnection.
- Added optional Active/Active mode support. Unlike Active/Passive
mode, load is evenly distributed between all working paths. If supported by
the device, it can significantly improve performance by utilizing the
bandwidth of all paths. It is controlled by the -A option during creation.
Disabled by default for now.
- Improved `status` and `list` commands output.

Sponsored by: iXsystems, Inc.
MFC after: 1 month


# 6472ac3d 07-Nov-2011 Ed Schouten <ed@FreeBSD.org>

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


# 5d807a0e 10-Jul-2011 Andrey V. Elsukov <ae@FreeBSD.org>

Include sys/sbuf.h directly.

Reviewed by: pjd


# eb8e9abe 04-May-2011 Andrey V. Elsukov <ae@FreeBSD.org>

Remove unneeded code.

MFC after: 1 week


# cb08c2cc 25-Feb-2011 Alexander Leidinger <netchild@FreeBSD.org>

Add some FEATURE macros for various GEOM classes.

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availability if
needed.

Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: silence on geom@ during 2 weeks
X-MFC after: to be determined in last commit with code from this project


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 87e7f7be 14-May-2010 Matt Jacob <mjacob@FreeBSD.org>

Yet another potential dereference of a dead provider.

Sponsored by: Panasas
MFC after: 1 week


# 1371a457 14-May-2010 Matt Jacob <mjacob@FreeBSD.org>

Make sure to check that the active provider pointer points to something before
dereferencing the pointer.

Sponsored by: Panasas
MFC after: 1 week


# a9560231 23-Apr-2010 Matt Jacob <mjacob@FreeBSD.org>

This is an MFC of 205847, 204071 and 196580

------
Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.

We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.

A usage implication is that you should specify the currently active
storage path as the first provider.

Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).

------------------------------------------------------------------------

- Style fixes.
- Prefer strlcpy() over strncpy().

------------------------------------------------------------------------

There's no need for checking result of M_WAITOK allocation.


# 2ef84eca 23-Apr-2010 Matt Jacob <mjacob@FreeBSD.org>

This is an MFC of 205412.

Add 'rotate' and 'getactive' verbs to provide some control and information
about what the currently active path is.


# 2b4969ff 29-Mar-2010 Matt Jacob <mjacob@FreeBSD.org>

Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.

We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.

A usage implication is that you should specify the currently active
storage path as the first provider.

Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).

Sponsored by: Panasas
MFC after: 1 month


# b5dce617 21-Mar-2010 Matt Jacob <mjacob@FreeBSD.org>

Add 'rotate' and 'getactive' verbs to provide some control and information
about what the currently active path is.

Sponsored by: Panasas
MFC after: 1 month


# 12f35a61 18-Feb-2010 Pawel Jakub Dawidek <pjd@FreeBSD.org>

- Style fixes.
- Prefer strlcpy() over strncpy().


# 264a8db4 07-Sep-2009 Pawel Jakub Dawidek <pjd@FreeBSD.org>

MFC r196579:

Fix an obvious topology lock leak.

Approved by: re (kib)


# 07a93e6b 27-Aug-2009 Pawel Jakub Dawidek <pjd@FreeBSD.org>

There's no need for checking result of M_WAITOK allocation.


# c16ce31b 27-Aug-2009 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Fix an obvious topology lock leak.

MFC after: 3 days


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 3745c395 20-Oct-2007 Julian Elischer <julian@FreeBSD.org>

Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
This makes way for us to add REAL kthread_create() and friends
that actually make threads. It turns out that most of these
calls actually end up being moved back to the thread version
when it's added, but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.


# e770bc6b 26-Feb-2007 Matt Jacob <mjacob@FreeBSD.org>

First cut at GEOM based multipath. This is an active/passive{/passive...}
arrangement that has no intrinsic internal knowledge of whether devices
it is given are truly multipath devices. As such, this is a simplistic
approach, but still a useful one.

The basic approach is to (at present- this will change soon) use camcontrol
to find likely identical devices and label the trailing sector of the
first one. This label contains both a full UUID and a name. The name is
what is presented in /dev/multipath, but the UUID is used as a true
distinguisher at g_taste time, thus making sure we don't have chaos
on a shared SAN where everyone names their data multipath as "Fred".

The first of N identical devices (and N *may* be 1!) becomes the active
path until a BIO request is failed with EIO or ENXIO. When this occurs,
the active disk is ripped away and the next in a list is picked to
(retry and) continue with.

During g_taste events new disks that meet the match criteria for existing
multipath geoms get added to the tail end of the list.

Thus, this active/passive setup actually does work for devices which
go away and come back, as do (now) mpt(4) and isp(4) SAN based disks.

There is still a lot to do to improve this- like about 5 of the 12
recommendations I've received about it, but it's been functional enough
for a while that it deserves a broader test base.

Reviewed by: pjd
Sponsored by: IronPort Systems
MFC: 2 months
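
The failover rule from this first version (drop the active path on EIO or ENXIO, then continue with the next one in the list) can be sketched as follows. The structure and function names are hypothetical, and the list is modelled as a small array of integer path ids for brevity.

```c
#include <errno.h>
#include <stdbool.h>

#define SK_MAXPATHS 4

/* Hypothetical geom state: paths[0] is the active path. */
struct sk_geom {
	int npaths;
	int paths[SK_MAXPATHS];
};

/*
 * Handle completion of a request on the active path. On EIO or ENXIO
 * the active path is removed and the next one becomes active.
 * Returns true if the request should be retried on the new active path.
 */
static bool
sk_io_done(struct sk_geom *g, int error)
{
	if (error != EIO && error != ENXIO)
		return (false);		/* success or an unrelated error */

	/* Rip the active path away by shifting the rest forward. */
	for (int i = 1; i < g->npaths; i++)
		g->paths[i - 1] = g->paths[i];
	if (g->npaths > 0)
		g->npaths--;

	return (g->npaths > 0);
}
```

When the last path is removed the function returns false, so the error is reported to the layer above instead of being retried.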