History log of /freebsd-10.3-release/sys/geom/raid3/
Revision Date Author Comments
296373 04-Mar-2016 marius

- Copy stable/10@296371 to releng/10.3 in preparation for 10.3-RC1
builds.
- Update newvers.sh to reflect RC1.
- Update __FreeBSD_version to reflect 10.3.
- Update default pkg(8) configuration to use the quarterly branch.

Approved by: re (implicit)

256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


245456 15-Jan-2013 mav

Allow to insert new component to geom_raid3 without specifying number.

PR: kern/160562
MFC after: 2 weeks


245444 15-Jan-2013 mav

As in r242314 for GRAID, make GRAID3 more aggressive in marking volumes
as clean on shutdown, and move that action from the shutdown_pre_sync stage
to shutdown_post_sync to avoid extra flapping.

ZFS tends not to close devices on shutdown, which prevents GEOM RAID
from shutting down gracefully. To handle that, mark the volume as clean as
soon as shutdown time comes and there are no active writes.

MFC after: 2 weeks


240371 11-Sep-2012 glebius

When synchronizing, include in the config dump the number of
bytes synchronized.
The rationale behind this is the following: for large disks the
percent synchronisation counter ticks too seldom, and monitoring
software (as well as a human operator) can't tell whether
synchronisation is progressing or one of the disks got stuck. On an idle
server one can look into gstat and see whether synchronisation is going
on or not, but on a busy server that won't work. Also, the new value
can be differentiated over time to obtain the synchronisation speed
quite precisely.

Submitted by: Konstantin Kukushkin <dark ramtel.ru>
Reviewed by: pjd
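
As a rough illustration of the "differentiate the byte counter" idea above, the following Python sketch derives a synchronisation speed from two samples of the counter. The sample format (timestamp in seconds, bytes synchronized) and the function name are assumptions made for this example; the log does not say how monitoring software reads the value.

```python
def sync_speed(sample_a, sample_b):
    """Return bytes per second between two (timestamp, bytes_synced) samples.

    Both the sample layout and this helper are hypothetical; they only
    illustrate differentiating a monotonically growing byte counter.
    """
    (t0, b0), (t1, b1) = sample_a, sample_b
    if t1 <= t0:
        raise ValueError("samples must be ordered in time")
    return (b1 - b0) / (t1 - t0)


def looks_stuck(sample_a, sample_b):
    """A synchronisation that made no progress between samples may be stuck."""
    return sync_speed(sample_a, sample_b) == 0


# 50 MiB synchronized in 10 seconds.
speed = sync_speed((0.0, 0), (10.0, 50 * 1024 * 1024))
```

A percent counter on a multi-terabyte disk may not change for minutes, whereas a byte counter sampled this way shows progress (or the lack of it) between any two polls.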


227309 07-Nov-2011 ed

Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.

The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.


223921 11-Jul-2011 ae

Include sys/sbuf.h directly.

Reviewed by: pjd


221101 27-Apr-2011 mav

Implement relaxed comparison for hardcoded provider names to make it
ignore the adX/adaY difference in both directions, to simplify migration to
the CAM-based ATA or back.
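
One way to picture such a relaxed comparison is sketched below in Python. The rule used here (strip an optional /dev/ prefix, then ignore a leading "ad" or "ada" plus its unit number and compare only the remainder) is an assumption made for illustration; the log does not spell out the exact rule, and the function names are hypothetical.

```python
import re

# Assumed rule: a device prefix is "ada" or "ad" followed by a unit number.
_DEVICE_PREFIX = re.compile(r"^(ada|ad)\d+")


def _strip_dev(name):
    """Drop a leading /dev/ so that both spellings compare equal."""
    return name[5:] if name.startswith("/dev/") else name


def relaxed_names_equal(name_a, name_b):
    """Compare provider names, ignoring an adX versus adaY difference."""
    name_a, name_b = _strip_dev(name_a), _strip_dev(name_b)
    if name_a == name_b:
        return True
    match_a = _DEVICE_PREFIX.match(name_a)
    match_b = _DEVICE_PREFIX.match(name_b)
    if match_a is None or match_b is None:
        return False
    # Only the part after the device name and unit (e.g. "s1a") must agree.
    return name_a[match_a.end():] == name_b[match_b.end():]
```

Under this assumed rule, a name hardcoded as ad4s1a still matches after the disk reappears as ada0s1a, while names of unrelated drivers such as da0s1 are compared strictly.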


219029 25-Feb-2011 netchild

Add some FEATURE macros for various GEOM classes.

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availability if
needed.

Sponsored by: Google Summer of Code 2010
Submitted by: kibab
Reviewed by: silence on geom@ during 2 weeks
X-MFC after: to be determined in last commit with code from this project


217305 12-Jan-2011 ae

Sector size cannot be greater than MAXPHYS. Since GRAID3 calculates
sector size from the user-specified block size, tell the user when the
blocksize is too big.

PR: kern/147851
MFC after: 1 week


201567 05-Jan-2010 mav

Move wakeup() out of mutex to reduce contention.


201545 05-Jan-2010 mav

Slightly optimize XOR calculation.


200940 24-Dec-2009 mav

Since geom_raid3 reports its own stripe as the sector size, report the largest
underlying provider's stripe, multiplied by the number of data disks in the
array (because of the transformation done), as the array stripe.
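
The rule described in this commit can be written as a one-line formula. The Python sketch below assumes that exactly one component holds parity, so the number of data disks is the component count minus one; the function name is hypothetical.

```python
def array_stripesize(component_stripesizes):
    """Largest component stripe multiplied by the number of data disks.

    Assumes one parity component, i.e. data disks = components - 1.
    """
    ndata = len(component_stripesizes) - 1
    if ndata < 2:
        raise ValueError("need at least three components")
    return max(component_stripesizes) * ndata


# Three components, two of them data disks, largest stripe 4096 bytes.
stripe = array_stripesize([512, 4096, 4096])
```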


200821 21-Dec-2009 mav

Make graid3 fall back to malloc() when component request size is bigger
than maximal prepared UMA zone size. This fixes crash with MAXPHYS > 128K.
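
The fallback can be pictured as a size-class lookup with a catch-all. The Python sketch below assumes three prepared zones of 4 kB, 16 kB and 64 kB, inferred from the kern.geom.raid3.n{64,16,4}k tunables mentioned elsewhere in this log; the function and the returned labels are purely illustrative.

```python
# Assumed zone sizes, inferred from the n4k/n16k/n64k tunable names.
ZONE_SIZES = (4 * 1024, 16 * 1024, 64 * 1024)


def pick_allocator(request_size):
    """Return the smallest zone that fits, or "malloc" when none does."""
    for zone in ZONE_SIZES:
        if request_size <= zone:
            return "uma:%dk" % (zone // 1024)
    # Larger than every prepared zone: fall back to the general allocator.
    return "malloc"
```

With three components (two data disks) and MAXPHYS of 128K, a component request is at most 64K and always fits a zone; only a larger MAXPHYS produces requests that need the fallback, which is consistent with the crash described above.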


190878 10-Apr-2009 thompsa

Revert r190676,190677

The geom and CAM changes for root_hold are the wrong solution for USB design
quirks.

Requested by: scottl


190676 03-Apr-2009 thompsa

Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called
in situations where sleeping isn't allowed.


172836 20-Oct-2007 julian

Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
This makes way for us to add REAL kthread_create() and friends
that actually make threads. It turns out that most of these
calls actually end up being moved back to the thread version
when it's added, but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.


170307 05-Jun-2007 jeff

Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
synchronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.

Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)


163888 01-Nov-2006 pjd

Now that we have gjournal in the tree, add the possibility to configure
gmirror and graid3 so that they are not resynchronized after a
power failure or system crash.
This is safe when gjournal is running on top of gmirror/graid3.


163886 01-Nov-2006 pjd

Change spaces to tabs where needed.


163836 31-Oct-2006 pjd

Implement BIO_FLUSH handling by simply passing it down to the components.

Sponsored by: home.pl


163206 10-Oct-2006 pjd

Guard against invalid metadata.

MFC after: 1 week


162835 30-Sep-2006 pjd

One more white space fix.


162832 30-Sep-2006 pjd

Remove trailing spaces.


162350 16-Sep-2006 pjd

Small fixes after adding __printflike() to gctl_error().

Approved by: phk
MFC after: 3 days


162282 13-Sep-2006 pjd

Fix synchronization in gmirror and graid3 which I broke. Synchronization
request can still have bio_to set to sc_provider (this is READ part of a
synchronization request) and in this case g_{mirror,raid3}_sync() wasn't
called as it should be.

MFC after: 1 week


162188 09-Sep-2006 jmg

move created/detected/activated under debug level 1 to quiet the common case..

add count of active and total components to the launched line so you can
see at a glance if your mirror/raid3 is complete...

now:
GEOM_MIRROR: Device mirror/sam launched (2/2).

Reviewed by: pjd


161116 09-Aug-2006 pjd

Not only a request from us can be passed to g_{mirror,raid3}_worker()
function, but also a request to us, in which case checking bio_cflags
is wrong, because the class above us is controlling it, not us.

MFC after: 1 week


160964 04-Aug-2006 yar

Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR: misc/101245
Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
Tested by: md5(1)
MFC after: 1 week


160895 01-Aug-2006 pjd

Don't use f-word in comments. We are gentlemen.

Pointed out by: Maciej Sobczak


160330 13-Jul-2006 pjd

Always allow to specify components with /dev/ prefix.

MFC after: 3 days


160248 10-Jul-2006 pjd

Use proper defines instead of magic values.

MFC after: 1 week


160203 09-Jul-2006 pjd

When kern.geom.raid3.use_malloc tunable is set to 1, malloc(9) instead of
uma(9) will be used for memory allocation.
In case of problems or tracking bugs, there are more useful tools for malloc(9)
debugging than for uma(9) debugging, like memguard(9) and redzone(9).

MFC after: 1 week


160155 07-Jul-2006 pjd

Remove bogus assertion.

Reported by: Bradley W. Dutton <brad-fbsd-stable@duttonbros.com>
MFC after: 3 days


160081 03-Jul-2006 pjd

Allow to close access even if device is already destroyed.

Reported by: Ulrich Spoerlein <uspoerlein@gmail.com>
PR: kern/98093
MFC after: 1 week


158290 04-May-2006 pjd

Use G_RAID3_FOREACH_SAFE_BIO() macro instead of G_RAID3_FOREACH_BIO() in
two places where g_io_request() is called. g_io_request() can free bio
structure so we can't reference it after and G_RAID3_FOREACH_BIO() macro
was doing this.

Found by: Coverity Prevent analysis tool (with my new models)
MFC after: 1 day
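
The difference between the two macro styles can be modelled in a few lines of Python. This is only an analogy: the Bio class, the queue helpers and io_request() below are hypothetical stand-ins, and clearing the link merely simulates a freed structure (in C, touching freed memory is undefined behaviour rather than a clean early stop).

```python
class Bio:
    """Minimal stand-in for a queued request with a link to the next one."""

    def __init__(self, name):
        self.name = name
        self.next = None


def make_queue(names):
    """Build a singly linked queue and return its head."""
    head = None
    for name in reversed(names):
        bio = Bio(name)
        bio.next = head
        head = bio
    return head


def io_request(bio, log):
    """Record the request, then simulate the structure being freed."""
    log.append(bio.name)
    bio.next = None  # the link must not be trusted after this call


def foreach_unsafe(head, fn):
    bio = head
    while bio is not None:
        fn(bio)
        bio = bio.next  # reads the element after fn() may have freed it


def foreach_safe(head, fn):
    bio = head
    while bio is not None:
        following = bio.next  # saved before fn() can invalidate bio
        fn(bio)
        bio = following
```

The safe variant visits every queued request because it never looks at an element again once the callback has run; the unsafe variant in this model silently stops after the first one.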


158195 30-Apr-2006 pjd

We shouldn't lock the topology here - we will panic on assertion inside
g_raid3_bump_syncid().

Reported by: Bradley W. Dutton <brad-fbsd-stable@duttonbros.com>
MFC after: 1 day


158117 28-Apr-2006 pjd

- Don't hold the device sx lock when going to sleep.
- Prevent possible live-lock in case of memory problems by freeing
already completed requests first.

Reported and tested by: markus, Bradley W. Dutton <brad-fbsd-stable@duttonbros.com>
MFC after: 1 day


158116 28-Apr-2006 pjd

- Remove dead code.
- Comment possible event miss, which isn't critical, but probably can be
fixed by replacing the event lock usage with the queue lock.

MFC after: 2 weeks


158114 28-Apr-2006 pjd

Be sure to not destroy device twice. This is not possible in theory, but
with this change there is even no theoretical race.

MFC after: 2 weeks


157838 18-Apr-2006 pjd

Fix storing offset of already synchronized data. Offset in entire array was
stored in metadata instead of an offset in single disk.
After reboot/crash synchronization process started from a wrong offset
skipping (not synchronizing) part of the component which can lead to data
corruption (when synchronization process was interrupted on initial
synchronization) or other strange situations like 'graid3 status' showing
value more than 100%.

Reported, reviewed and tested by: ru
Reported by: Dmitry Morozovsky <marck@rinet.ru>
MFC after: 1 day
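
The arithmetic behind this bug can be sketched as follows. The Python snippet assumes one parity component, so an offset within the whole array maps to that offset divided by the number of data disks on each component; the helper names and the sizes are made up for the example.

```python
def per_disk_offset(array_offset, ncomponents):
    """Convert an offset in the whole array to an offset on one component.

    Assumes one parity component, so data is spread over ncomponents - 1 disks.
    """
    if ncomponents < 3:
        raise ValueError("need at least three components")
    return array_offset // (ncomponents - 1)


def sync_percent(stored_offset, disk_size):
    """Progress as it would be computed from the stored per-disk offset."""
    return 100 * stored_offset // disk_size


# Hypothetical numbers: 3 components of 1000 bytes each, array of 2000 bytes.
array_offset = 1500
correct = sync_percent(per_disk_offset(array_offset, 3), 1000)
wrong = sync_percent(array_offset, 1000)  # array offset stored by mistake
```

Storing the array-wide offset makes the resume point overshoot on every component, which explains both the skipped region after a restart and a status above 100%.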


157630 10-Apr-2006 pjd

Introduce and use delayed-destruction functionality from a pre-sync hook,
which means that devices will be destroyed on last close.

This fixes destruction order problems when, e.g., a RAID3 array is built on
top of RAID1 arrays.

Requested, reviewed and tested by: ru
MFC after: 2 weeks


157222 28-Mar-2006 pjd

Preserve previous behaviour of kern.geom.raid3.n{64,16,4}k tunables, where 0
means unlimited.

Reported by: ru
MFC after: 3 days


157134 25-Mar-2006 pjd

Increase debug level for "Thread exiting." message. It's not that important
and is 0 by accident.

MFC after: 3 days


156878 19-Mar-2006 pjd

Update copyright for 2006.


156876 19-Mar-2006 pjd

kern.geom.raid3.sync_requests=2 seems to be a better default - it still
keeps disks very busy, but makes system much more responsive.

While here, kill extra space.


156684 13-Mar-2006 ru

Fix build on 64-bit platforms.


156612 13-Mar-2006 pjd

- Reimplement I/O data allocation to prevent deadlocks.

Submitted by: green

- Speed up synchronization process by using configurable number of I/O
requests in parallel.
+ Add kern.geom.raid3.sync_requests tunable which defines how many parallel
I/O requests should be used.
+ Retire kern.geom.raid3.reqs_per_sync and kern.geom.raid3.syncs_per_sec
sysctls.
- Fix race between regular and synchronization requests.
- Reimplement raid3's data synchronization - do not use the topology lock
for this purpose, as it may cause deadlocks.
- Stop synchronization from pre-sync hook.
- Fix some other minor issues.

Tested by: Mike Tancsa <mike@sentex.net>
MFC after: 3 days


156527 10-Mar-2006 pjd

When inserting a new component md_provsize metadata field wasn't set, which
means that old problem was triggered (when two providers end at the same
offset, e.g. ad0 and ad0s1, and the wrong one is picked up by gmirror/graid3).

Reported by: Michal Suszko <dry@dry.pl>
MFC after: 3 days


155906 22-Feb-2006 pjd

Do not use bio structure after g_io_deliver(), it may no longer be valid.

Found and fixed by: Vsevolod Lobko <seva@ip.net.ua>
MFC after: 3 days


155582 12-Feb-2006 pjd

On component state change to ACTIVE don't forget to update metadata.

MFC after: 3 days


155581 12-Feb-2006 pjd

Use time_uptime instead of time_second, as the latter may go backwards.

Suggested by: ru
MFC after: 3 days


155560 12-Feb-2006 pjd

Allow to set kern.geom.raid3.disconnect_on_failure from loader.conf.

MFC after: 3 days


155546 11-Feb-2006 pjd

- Add kern.geom.raid3.disconnect_on_failure sysctl/tunable (default to 1
to preserve current behaviour). When set to 0, components are not
disconnected - graid3 will try to still use them (only first error will
be logged). This is helpful when we have two broken components, but in
different places, so actually all data is available.
Such buggy component will be visible in 'graid3 list' output with flag
BROKEN.
- Never disconnect the last valid component. If we detect errors there we
will just pass them up. It wasn't reasonable to deny access to the
whole provider because of one broken sector.

Prodded by: ru
MFC after: 3 days


155544 11-Feb-2006 pjd

Correct typo. 'fbp' is NULL here so this will result in a panic.

MFC after: 3 days


155540 11-Feb-2006 pjd

Mark array as CLEAN when there are no write requests in
kern.geom.raid3.idletime seconds. Write, not any requests.
Mark array as clean immediately on last write close.

Prodded by: ru
MFC after: 3 days


155174 01-Feb-2006 pjd

Remove trailing spaces.


155070 30-Jan-2006 pjd

Fix typo which caused the 64kB elements limit to not be set properly and
the 16kB elements limit to not be set at all.

Submitted by: Vsevolod Lobko <seva@ip.net.ua>
MFC after: 3 days


154539 18-Jan-2006 pjd

Remove dead code.

Found by: Coverity Prevent(tm)
Coverity ID: CID105
MFC after: 3 days


152967 30-Nov-2005 sobomax

Check for g_read_data(9) errors properly:

o The only indication of error condition is NULL value returned by
the function;

o value pointed to by error argument is undefined in the case when
operation completes successfully.

Discussed with: phk


151897 31-Oct-2005 rwatson

Normalize a significant number of kernel malloc type names:

- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.


151822 28-Oct-2005 pjd

Fix possible live-lock under heavy load where we can't allocate more
memory for request.
I was sure graid3 should handle such situations well, but green@ reported
that it does not, and we want to fix it before 6.0.

Submitted by: green


148440 27-Jul-2005 pjd

Use root_mount KPI for RAID3 to delay root file system mount.
Actually, one cannot set up a root file system on a RAID3 device, but when
other file systems listed in /etc/fstab are placed on RAID3 devices, the
boot process will be interrupted when these devices are missing.

MFC after: 3 days
X-MFC-note: MFC only to RELENG_6, as RELENG_5 doesn't have root_mount KPI.


146118 11-May-2005 pjd

cp can't be NULL.

Noticed by: Coverity Prevent analysis tool


146117 11-May-2005 pjd

gp can't be NULL.

Noticed by: Coverity Prevent analysis tool


144144 26-Mar-2005 pjd

If an error occurs, clean up before returning from g_raid3_connect_disk().


144142 26-Mar-2005 pjd

Check for return values.

Submitted by: sam
Found by: Coverity Prevent analysis tool


142727 27-Feb-2005 pjd

- Add md_provsize field to metadata, which will help with
shared-last-sector problem.
After this change, even if there is more than one provider with the same
last sector, the proper one will be chosen based on its size.
It still doesn't fix the 'c' partition problem (when da0s1 can be confused
with da0s1c) and situation when 'a' partition starts at offset 0
(then da0s1a can be confused with da0s1 and da0s1c). One can use '-h'
option there, when creating device or avoid sharing last sector.
Actually, when providers share the same last sector and their size is equal,
they provide exactly the same data, so the name (da0s1, da0s1a, da0s1c)
isn't important at all.
- Provide backward compatibility.
- Update copyright's year.

MFC after: 1 week


141994 16-Feb-2005 pjd

Update copyright in files changed this year.


139940 09-Jan-2005 pjd

Increase default synchronization speed.

MFC after: 3 days


139671 04-Jan-2005 pjd

- Fix 'rebuild' command - it can no longer rely on retaste event
(we ignore it).
- Remove code used for handling spoil events, as spoiling is not possible
anymore, because we keep consumers open for writing all the time.

MFC after: 4 days


139622 03-Jan-2005 pjd

Remove unused #include.


139451 30-Dec-2004 jhb

Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.


139379 28-Dec-2004 pjd

Remove debug code.


139295 25-Dec-2004 pjd

- Add genid field to the metadata, which improves reliability a bit.
After this change, when a component is disconnected because of an I/O error,
it will not be connected and synchronized automatically; it will be logged
as broken and skipped. Autosynchronization can still occur when a component is
disconnected (on an orphan event) and connected again: there was no I/O
error, so there is no reason not to connect the component, but if there were
writes while it wasn't connected, it will be synchronized.
This fixes cases where a component is disconnected because of an I/O error and
can be connected again and again.
- Bump version number.
- Implement backward compatibility mechanism. After this change when metadata in
old version is detected, it is automatically upgraded to the new (current)
version.


139146 21-Dec-2004 pjd

Now, when force device destruction is done on shutdown, hide warning,
that device cannot be destroyed immediately, under debug=1.

Suggested by: simon


139144 21-Dec-2004 pjd

Improve reliability and clean up code a bit.
For more details check src/sys/geom/mirror/g_mirror.c rev.1.47,1.48,1.49,1.50.


138801 13-Dec-2004 pjd

bioq_insert_head() function is already in subr_disk.c.


138374 04-Dec-2004 pjd

When initializing device, set d_softc and d_no fields for all components,
because we know it then and we need it when inserting a component which
wasn't destroyed while device was running.

Reported by: Michael Handler <handler@grendel.net>
MFC after: 1 week


137490 09-Nov-2004 pjd

Before trying to update metadata (so open consumer for writing), be sure
that the events queue is empty. In other case we're able to hit the race
where for example da0s1 is tasted by some other class, which means that
da0 is open with exclusive bit set, which means that we can't open da0
for writing if it is our component.

Reported by: Attila Nagy <bra@fsn.hu> (and somebody else sometime ago,
but I cannot find who it was)


137487 09-Nov-2004 pjd

Don't rely on DIRTY flag to be sure that consumer is open, because
DIRTY flag can be removed in idle process. Use consumer's acw field
instead to avoid opening consumer twice.


137485 09-Nov-2004 pjd

For BIO_READ check if provider is open for reading and for BIO_WRITE,
check if provider is open for writing.
This fixes panic when device is open only for writing and we send write
request.


137421 09-Nov-2004 pjd

Drop Giant lock before grabbing the topology lock.


137412 08-Nov-2004 pjd

If device is marked as being destroyed, deny all access requests.


137259 05-Nov-2004 pjd

Don't forget to make sure that there are no unfinished requests before
marking components as clean.

Pointed out by: scottl


137258 05-Nov-2004 pjd

- Mark all raid3 components as clean after kern.geom.raid3.idletime seconds.
- Make kern.geom.raid3.timeout variable tunable.


137257 05-Nov-2004 pjd

Mark raid3 devices as clean on shutdown (after all file systems are
unmounted).

Suggested by: scottl


137256 05-Nov-2004 pjd

- Use ->index consumer's field to track number of in-flight requests.
- Remove unused #include.


135872 28-Sep-2004 pjd

Just use MAXPHYS as maximum I/O request size, instead of using my own
#define for this purpose.
No functional change.


135866 27-Sep-2004 pjd

Decrease kern.geom.raid3.timeout to 4, so it is smaller than
vfs.root.mountdelay by default.


135863 27-Sep-2004 pjd

Avoid race while synchronizing components. It is very hard to bump into,
but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.

Found by: tegge (for gmirror)


135522 20-Sep-2004 pjd

This is not needed anymore, it is forced in GEOM now.
Actually, it can even cause some problems, because GEOM requires sectorsize
to be more than 0 on first access, not on provider creation, so we can skip
valid providers by doing this check here.

Reported by: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
Sven Willenberger <sven@dmv.com>


134528 30-Aug-2004 pjd

Allow to configure debug level from /boot/loader.conf.


134486 29-Aug-2004 pjd

GCC, ehh.


134421 28-Aug-2004 pjd

Use sc->sc_mediasize instead of sc->sc_provider->mediasize which contains
exactly the same value, but is shorter.


134420 28-Aug-2004 pjd

Warn the user if we are not going to use whole provider space.

Requested by: Michael Handler <handler@grendel.net>


134418 28-Aug-2004 pjd

Don't allow to insert providers, which are too small.

Reported by: Michael Handler <handler@grendel.net>


134344 26-Aug-2004 pjd

Skip providers with not defined sector size.

Reported by: kuriyama


134303 25-Aug-2004 pjd

Log verification errors at level 1.


134168 22-Aug-2004 pjd

Implementation of 'verify reading' algorithm, which uses parity data for
verification of regular data when device is in complete state.
On verification error, EIO error is returned for the bio and sysctl
kern.geom.raid3.stat.parity_mismatch is increased.

Suggested by: phk
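
The check behind "verify reading" can be illustrated with plain XOR arithmetic. The Python sketch below is not the kernel code: the dictionary stands in for the kern.geom.raid3.stat.parity_mismatch counter, blocks are modelled as byte strings of equal length, and the function names are hypothetical.

```python
import errno
from functools import reduce
from operator import xor

# Hypothetical stand-in for the kern.geom.raid3.stat.parity_mismatch counter.
stats = {"parity_mismatch": 0}


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def verify_read(data_blocks, parity_block):
    """Return 0 when parity matches the data, otherwise count it and return EIO."""
    if xor_blocks(data_blocks) != parity_block:
        stats["parity_mismatch"] += 1
        return errno.EIO
    return 0


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]
parity = xor_blocks(data)
```

Verification is only possible when every component, including parity, is readable, which is why the commit restricts it to a device in complete state.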


134136 21-Aug-2004 pjd

Add version history.


134124 21-Aug-2004 pjd

Implement new reading algorithm, which will use parity component for reading
as well, even if device is in complete state.
I observe a 40% speed-up with this option for random read operations,
but a slowdown for sequential reads.
Basically, without this option reading from a RAID3 device built from 5
components (c0-c4) looks like this:

Request no.   Used components
1             c0+c1+c2+c3
2             c0+c1+c2+c3
3             c0+c1+c2+c3

With the new feature:

Request no.   Used components
1             c0+c1+c2+c3
2             (c1^c2^c3^c4)+c1+c2+c3
3             c0+(c0^c2^c3^c4)+c2+c3
4             c0+c1+(c0^c1^c3^c4)+c3
5             c0+c1+c2+(c0^c1^c2^c4)
6             c0+c1+c2+c3
[...]
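
The rotation shown in the table can be reproduced with a short Python sketch. The rule used here (a cycle as long as the number of components, with the first request of each cycle reading all data components directly and each later request reconstructing one data component from the others plus parity) is inferred from the table alone and is not taken from the implementation; the function name is hypothetical.

```python
def read_plan(request_no, ncomponents=5):
    """Describe which components serve a 1-based request number.

    Assumes the last component (c4 for five components) holds parity and
    that the rotation follows the pattern of the table above.
    """
    ndata = ncomponents - 1
    slot = (request_no - 1) % ncomponents
    parts = []
    for index in range(ndata):
        if slot != 0 and index == slot - 1:
            # Rebuild this data component by XOR-ing all other components.
            others = [i for i in range(ncomponents) if i != index]
            parts.append("(" + "^".join("c%d" % i for i in others) + ")")
        else:
            parts.append("c%d" % index)
    return "+".join(parts)


plans = [read_plan(n) for n in range(1, 7)]
```

Each reconstructed read skips one data disk and reads the parity disk instead, so over a full cycle the load is spread across all five components, which fits the reported gain for random reads.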


133991 18-Aug-2004 pjd

We really don't want to receive spoil event for synchronization consumers.


133979 18-Aug-2004 pjd

Dump device status on 'list' command.


133839 16-Aug-2004 obrien

Minor style.9 cleanup.


133825 16-Aug-2004 pjd

Decrease debug level to 0.


133823 16-Aug-2004 pjd

Fix warning.


133808 16-Aug-2004 pjd

Introduce GEOM RAID3 class, i.e. kernel module, which implements RAID3
transformation and graid3(8) userland utility, which can be used for
configuration. No manual page yet, sorry.

Hardware provided by: Daniel Seuffert