History log of /netbsd-current/sys/dev/raidframe/rf_reconstruct.c
Revision Date Author Comments
# 1.129 17-Sep-2023 oster

Implement hot removal of spares and components. From manu@.

Implement a long desired feature of automatically incorporating
a used spare into the array after a reconstruct.

Given the configuration:
Components:
/dev/wd0e: failed
/dev/wd1e: optimal
/dev/wd2e: optimal
Spares:
/dev/wd3e: spare

Running 'raidctl -F /dev/wd0e raid0' will now result in the
following configuration after a successful rebuild:
Components:
/dev/wd3e: optimal
/dev/wd1e: optimal
/dev/wd2e: optimal
No spares.
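
The before/after configurations above amount to moving the rebuilt spare into the failed component's slot and dropping it from the spare list. A minimal sketch of that bookkeeping follows; struct array_sketch and incorporate_spare() are hypothetical names for illustration and are not the RAIDframe data structures or code.

```c
#include <stddef.h>

#define MAX_DISKS 4

/* Hypothetical, simplified view of an array: device names only. */
struct array_sketch {
	const char *components[MAX_DISKS];
	int num_components;
	const char *spares[MAX_DISKS];
	int num_spares;
};

/*
 * Move the spare at spare_idx into component column col and remove it
 * from the spare list.  Returns 0 on success, -1 on a bad index.
 */
static int
incorporate_spare(struct array_sketch *a, int col, int spare_idx)
{
	if (col < 0 || col >= a->num_components)
		return -1;
	if (spare_idx < 0 || spare_idx >= a->num_spares)
		return -1;

	a->components[col] = a->spares[spare_idx];

	/* Close the gap left in the spare list. */
	for (int i = spare_idx; i < a->num_spares - 1; i++)
		a->spares[i] = a->spares[i + 1];
	a->num_spares--;
	a->spares[a->num_spares] = NULL;
	return 0;
}
```

Applied to the example, column 0 (/dev/wd0e) would end up holding /dev/wd3e and the spare count would drop to zero.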

Thanks to manu@ for the development of the initial set of changes
which allowed the changes to automatically incorporate a used spare
to come to fruition. Thanks also to manu@ for useful discussions
about and additional testing of these changes.


# 1.128 08-Sep-2023 oster

Revision 1.104 actually fixed the issues that were preventing
us from freeing the ReconControl structures. So free them
and thus also prevent a panic on shutdown due to items not
being correctly returned to the pool.

Thanks to manu@ for report of the panic, and for initial testing
of the changes.

XXX pullup-9
XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.127 27-Jul-2021 oster

branches: 1.127.10;
rf_CreateDiskQueueData() no longer uses waitflag, and will always succeed.
Cleanup the error path for the (no longer needed) PR_NOWAIT cases.


# 1.126 23-Jul-2021 oster

Extensive mechanical changes to the pools used in RAIDframe.

Alloclist remains not per-RAID, so initialize that pool
separately/differently than the rest.

The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly
mechanical changes to functions to allocate/destroy per-RAID pools.
Needed to make raidPtr available in certain cases to be able to find
the per-RAID pools.

Extend rf_pool_init() to now populate a per-RAID wchan value that is
unique to each pool for a given RAID device.

TODO: Complete the analysis of the minimum number of items that are
required for each pool to allow IO to progress (i.e. so that a request
for pool resources can always be satisfied), and dynamically scale
minimum pool sizes based on RAID configuration.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.125 15-Feb-2021 oster

branches: 1.125.4;
Fix a long-standing off-by-one error in computing lastPSID.

SUsPerPU is only really supported for a value of 1, and since the
first PSID is 0, the last will be numStripe-1. Also update the
setting of pending_writes to reflect the change to lastPSID.

Needs pullups to -8 and -9.
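
The arithmetic behind the fix can be sketched as below, assuming SUsPerPU is 1 so that parity stripe IDs run from 0 upward. The helper names last_psid() and stripes_visited() are made up for illustration; only lastPSID and numStripe come from the log entry.

```c
#include <stdint.h>

/*
 * Hypothetical helper: with IDs starting at 0, the last valid parity
 * stripe ID is one less than the number of stripes.
 * The caller must pass num_stripe > 0.
 */
static uint64_t
last_psid(uint64_t num_stripe)
{
	return num_stripe - 1;
}

/* Count the stripes visited by an inclusive loop over [0, lastPSID]. */
static uint64_t
stripes_visited(uint64_t num_stripe)
{
	uint64_t count = 0;

	if (num_stripe == 0)
		return 0;
	for (uint64_t psid = 0; psid <= last_psid(num_stripe); psid++)
		count++;
	return count;
}
```

An inclusive loop bounded this way touches exactly num_stripe stripes; a bound one larger would touch one stripe too many.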


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base
# 1.124 08-Dec-2019 mlelstv

branches: 1.124.8;
Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

branches: 1.122.4;
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.12; 1.121.20;


Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.
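
The state change described in items 1, 3 and 4 can be sketched as a small state machine. The enum and functions below are illustrative stand-ins (DS_REBUILDING_SPARE for rf_ds_rebuilding_spare, DS_USED_SPARE for rf_ds_used_spare); they are assumptions for the sketch, not the definitions used in RAIDframe, and the failure path is deliberately left out because the log entry does not describe it.

```c
/* Illustrative disk states; names and values are hypothetical. */
enum disk_state {
	DS_SPARE,		/* idle hot spare */
	DS_REBUILDING_SPARE,	/* rebuild to this spare is in progress */
	DS_USED_SPARE		/* rebuild to this spare completed */
};

/* A spare enters the "rebuilding" state when the rebuild starts. */
static enum disk_state
start_rebuild(enum disk_state s)
{
	return (s == DS_SPARE) ? DS_REBUILDING_SPARE : s;
}

/* Only a successfully completed rebuild yields a used spare. */
static enum disk_state
complete_rebuild(enum disk_state s)
{
	return (s == DS_REBUILDING_SPARE) ? DS_USED_SPARE : s;
}

/* In this sketch a spare counts as up to date only once it is used. */
static int
spare_is_up_to_date(enum disk_state s)
{
	return s == DS_USED_SPARE;
}
```

With this split, a reboot in the middle of a rebuild finds the spare in the intermediate state, so it is not mistaken for an up-to-date component.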

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]
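
Storing the top 32 bits in a separate member, as the first item describes, might be sketched as follows. Only numBlocksHi is named in the log entry; the struct label_sketch layout, the low-half member name numBlocks and the two helper functions are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical cut-down label holding a 64-bit count in two halves. */
struct label_sketch {
	uint32_t numBlocks;	/* low 32 bits (assumed member name) */
	uint32_t numBlocksHi;	/* top 32 bits */
};

/* Reassemble the full 64-bit block count from both halves. */
static uint64_t
label_get_numblocks(const struct label_sketch *l)
{
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}

/* Split a 64-bit block count across the two 32-bit members. */
static void
label_set_numblocks(struct label_sketch *l, uint64_t n)
{
	l->numBlocks = (uint32_t)(n & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(n >> 32);
}
```

A label written by older code would have the high half zero, so reading it through such a helper gives the same value as before.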


reviewed by: oster, christos and darrenr


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.
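
The accounting issue can be sketched like this: a failed write has still finished, so the outstanding-write counter must drop for error events as well as for successful ones. The event names, struct recon_sketch and handle_event() are hypothetical and only illustrate the idea.

```c
/* Hypothetical event kinds for the sketch. */
enum recon_event { EV_WRITE_DONE, EV_WRITE_ERROR, EV_OTHER };

struct recon_sketch {
	int pending_writes;	/* writes issued but not yet finished */
	int write_failed;	/* set once any write reports an error */
};

static void
handle_event(struct recon_sketch *rc, enum recon_event ev)
{
	switch (ev) {
	case EV_WRITE_DONE:
		rc->pending_writes--;
		break;
	case EV_WRITE_ERROR:
		rc->write_failed = 1;
		/* The write is over even though it failed. */
		rc->pending_writes--;
		break;
	default:
		break;
	}
}
```

Without the decrement in the error case, code waiting for pending_writes to reach zero would never see it happen.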

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.
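
Rebuilding in chunks can be pictured as walking the stripes in fixed-size windows, so that per-stripe status storage is bounded by the window rather than by the size of the set. The function below is a rough sketch under that assumption; rebuild_in_chunks() and its parameters are invented for illustration and do not reflect the actual reconmap code.

```c
#include <stdint.h>

/*
 * Hypothetical: walk [0, num_stripes) in windows of at most `window`
 * stripes and return how many windows were needed.
 */
static uint64_t
rebuild_in_chunks(uint64_t num_stripes, uint64_t window)
{
	uint64_t chunks = 0;
	uint64_t start = 0;

	if (window == 0)
		return 0;

	while (start < num_stripes) {
		uint64_t remaining = num_stripes - start;
		/* The last window may be shorter than the others. */
		uint64_t len = remaining < window ? remaining : window;

		/*
		 * Status for stripes [start, start + len) would be
		 * tracked here, then the same storage reused for the
		 * next window.
		 */
		start += len;
		chunks++;
	}
	return chunks;
}
```

The clamp on len also covers a set with fewer stripes than the window, which then needs a single short chunk.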

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August; I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
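
Computing the percentage only when asked could look roughly like the sketch below. The function name percent_completed() and the choice to report 0 for an empty total are assumptions for illustration; the counters are taken as 64-bit values, in line with the later change to numRUsComplete and numRUsTotal.

```c
#include <stdint.h>

/*
 * Hypothetical on-demand calculation: derive the percentage from the
 * two counters instead of storing and updating it during the rebuild.
 */
static int
percent_completed(uint64_t rus_complete, uint64_t rus_total)
{
	if (rus_total == 0)
		return 0;	/* assumption: nothing to do reads as 0% */
	/* Fine for realistic counts; 100 * rus_complete could overflow
	 * only for values above roughly 2^57. */
	return (int)(100 * rus_complete / rus_total);
}
```

The rebuild loop then only has to keep the two counters current, and a status query pays for the division.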


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.
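
The distinction between "complete" and "left to do" can be sketched as below. This is only an illustration of the arithmetic, not the code that was committed: the helper name `percent_complete` and its parameters are hypothetical, and treating an empty job as 100% done is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: report the share of units already finished,
 * not the share still remaining.  Assumes completed * 100 does not
 * overflow 64 bits.
 */
static unsigned
percent_complete(uint64_t completed, uint64_t total)
{
	assert(completed <= total);
	if (total == 0)
		return 100;	/* assumption: nothing to do counts as done */
	return (unsigned)(completed * 100 / total);
}
```

Reporting `total - completed` through the same formula would give the percentage remaining instead, which is the kind of mix-up the message describes.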


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.128 08-Sep-2023 oster

Revision 1.104 actually fixed the issues that were preventing
us from freeing the ReconControl structures. So free them
and thus also prevent a panic on shutdown due to items not
being correctly returned to the pool.

Thanks to manu@ for report of the panic, and for initial testing
of the changes.

XXX pullup-9
XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.127 27-Jul-2021 oster

branches: 1.127.10;
rf_CreateDiskQueueData() no longer uses waitflag, and will always succeed.
Cleanup the error path for the (no longer needed) PR_NOWAIT cases.


# 1.126 23-Jul-2021 oster

Extensive mechanical changes to the pools used in RAIDframe.

Alloclist remains not per-RAID, so initialize that pool
separately/differently than the rest.

The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly
mechanical changes to functions to allocate/destroy per-RAID pools.
Needed to make raidPtr available in certain cases to be able to find
the per-RAID pools.

Extend rf_pool_init() to now populate a per-RAID wchan value that is
unique to each pool for a given RAID device.

TODO: Complete the analysis of the minimum number of items that are
required for each pool to allow IO to progress (i.e. so that a request
for pool resources can always be satisfied), and dynamically scale
minimum pool sizes based on RAID configuration.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base thorpej-futex-base
# 1.125 15-Feb-2021 oster

branches: 1.125.4;
Fix a long long-standing off-by-one error in computing lastPSID.

SUsPerPU is only really supported for a value of 1, and since the
first PSID is 0, the last will be numStripe-1. Also update the
setting of pending_writes to reflect the change to lastPSID.
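
The off-by-one can be illustrated with a minimal sketch. It is not the code in rf_reconstruct.c: the helper name `last_psid` and the `num_stripes` parameter are hypothetical, and the sketch assumes SUsPerPU is 1 so that there is one PSID per stripe.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: PSIDs are numbered from 0, so with one PSID
 * per stripe the last valid PSID is num_stripes - 1, not num_stripes.
 */
static uint64_t
last_psid(uint64_t num_stripes)
{
	assert(num_stripes > 0);	/* an empty set has no last PSID */
	return num_stripes - 1;
}
```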

Needs pullups to -8 and -9.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base
# 1.124 08-Dec-2019 mlelstv

branches: 1.124.8;
Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

branches: 1.122.4;
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts
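
The sizeof(type) -> sizeof(*variable) item might look like the following in userland C. This is a sketch only: `struct recon_desc` and `recon_desc_alloc` are hypothetical names, and `calloc` stands in for the RAIDframe allocation macros, which are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct recon_desc {
	int col;
	int spare_col;
};

static struct recon_desc *
recon_desc_alloc(void)
{
	struct recon_desc *d;

	/*
	 * sizeof(*d) tracks the declared type of d, so the size stays
	 * correct if the declaration above is ever changed.  calloc()
	 * also zeroes the memory, so no separate memset() is needed.
	 */
	d = calloc(1, sizeof(*d));
	return d;
}
```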

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.12; 1.121.20;


Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.
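
The state change described in points 1, 3 and 4 can be sketched as a small state machine. The enum and function names below are hypothetical stand-ins modeled on the state names in this message; they are not the definitions from the RAIDframe headers, and error paths are deliberately left out.

```c
#include <assert.h>

/* Hypothetical states, named after the ones in the log message. */
enum spare_state {
	SPARE_IDLE,		/* configured as a spare, not yet used */
	SPARE_REBUILDING,	/* cf. rf_ds_rebuilding_spare */
	SPARE_USED		/* cf. rf_ds_used_spare */
};

/* Starting a rebuild no longer marks the spare as used. */
static enum spare_state
spare_start_rebuild(enum spare_state s)
{
	assert(s == SPARE_IDLE);
	return SPARE_REBUILDING;
}

/* Only a rebuild that actually completed promotes the spare. */
static enum spare_state
spare_finish_rebuild(enum spare_state s)
{
	assert(s == SPARE_REBUILDING);
	return SPARE_USED;
}

/* A spare counts as up-to-date only after the rebuild has finished. */
static int
spare_is_up_to_date(enum spare_state s)
{
	return s == SPARE_USED;
}
```

With this split, a reboot that happens while the spare is still in the rebuilding state leaves it in a state that is not treated as up-to-date.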

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]
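
Splitting a 64-bit size across two 32-bit label fields can be sketched as follows. The structure and helper names are hypothetical and do not reproduce the real component label layout; only the idea of keeping the top 32 bits in a separate "Hi" field comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one size stored as two label fields. */
struct label_size {
	uint32_t lo;	/* low 32 bits (the pre-existing field) */
	uint32_t hi;	/* top 32 bits (cf. numBlocksHi) */
};

/* Recombine the two halves into the real 64-bit value. */
static uint64_t
label_size_get(const struct label_size *ls)
{
	return ((uint64_t)ls->hi << 32) | ls->lo;
}

/* Store a 64-bit value as two 32-bit halves. */
static void
label_size_set(struct label_size *ls, uint64_t value)
{
	ls->lo = (uint32_t)(value & 0xffffffffu);
	ls->hi = (uint32_t)(value >> 32);
}
```

In this sketch a size whose high half is zero reads back as the same value the low field held on its own.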


reviewed by: oster, christos and darrenr


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.
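
The accounting rule can be sketched like this. The names are hypothetical and the real event handling is more involved; the point is only that an error completion must retire a pending write just as a successful one does.

```c
#include <assert.h>

enum write_event { WRITE_OK, WRITE_ERROR };	/* hypothetical events */

struct write_acct {
	unsigned pending;	/* writes issued but not yet finished */
	unsigned failed;	/* writes that finished with an error */
};

static void
write_issued(struct write_acct *a)
{
	a->pending++;
}

/* Every completion event retires one pending write, error or not. */
static void
write_finished(struct write_acct *a, enum write_event ev)
{
	assert(a->pending > 0);
	a->pending--;
	if (ev == WRITE_ERROR)
		a->failed++;
}

/* A waiter may proceed only once nothing is outstanding. */
static int
writes_drained(const struct write_acct *a)
{
	return a->pending == 0;
}
```

If the error case skipped the decrement, `writes_drained()` would never become true and a waiter would block forever, which is the hang this revision addresses.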

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.
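
A sliding status window of this kind can be sketched as below. This is a simplified illustration under stated assumptions, not the reconmap implementation: the window size, the structure, and all function names are hypothetical, and the window here advances one whole chunk at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WINDOW_SLOTS 8	/* hypothetical window size */

struct status_window {
	uint64_t base;			/* first unit covered by the window */
	uint8_t  done[WINDOW_SLOTS];	/* 1 = unit reconstructed */
};

static void
window_init(struct status_window *w)
{
	w->base = 0;
	memset(w->done, 0, sizeof(w->done));
}

/* Returns 0 on success, -1 if the unit lies outside the window. */
static int
window_mark_done(struct status_window *w, uint64_t unit)
{
	if (unit < w->base || unit - w->base >= WINDOW_SLOTS)
		return -1;
	w->done[unit - w->base] = 1;
	return 0;
}

/* Slide forward by one chunk once every slot is complete. */
static int
window_advance(struct status_window *w)
{
	size_t i;

	for (i = 0; i < WINDOW_SLOTS; i++)
		if (!w->done[i])
			return 0;	/* still work left in this chunk */
	w->base += WINDOW_SLOTS;
	memset(w->done, 0, sizeof(w->done));
	return 1;
}
```

The memory cost is fixed by `WINDOW_SLOTS` rather than by the number of reconstruction units in the set, which is the scaling property the message is after.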

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange errors on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but we are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we now only ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.126 23-Jul-2021 oster

Extensive mechanical changes to the pools used in RAIDframe.

Alloclist remains not per-RAID, so initialize that pool
separately/differently than the rest.

The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly
mechanical changes to functions to allocate/destroy per-RAID pools.
Needed to make raidPtr available in certain cases to be able to find
the per-RAID pools.

Extend rf_pool_init() to now populate a per-RAID wchan value that is
unique to each pool for a given RAID device.

TODO: Complete the analysis of the minimum number of items that are
required for each pool to allow IO to progress (i.e. so that a request
for pool resources can always be satisfied), and dynamically scale
minimum pool sizes based on RAID configuration.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.125 15-Feb-2021 oster

Fix a long-standing off-by-one error in computing lastPSID.

SUsPerPU is only really supported for a value of 1, and since the
first PSID is 0, the last will be numStripe-1. Also update the
setting of pending_writes to reflect the change to lastPSID.

Needs pullups to -8 and -9.
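
The arithmetic described above can be sketched as follows. This is a minimal illustration, not the code from rf_reconstruct.c: the helper names are hypothetical, and it assumes one stripe unit per parity unit (SUsPerPU == 1), so parity stripe IDs run from 0 to numStripe-1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a RAIDframe function): with the first
 * PSID being 0, the last valid PSID is numStripe - 1.
 */
static uint64_t
example_last_psid(uint64_t numStripe)
{
	assert(numStripe > 0);
	return numStripe - 1;
}

/*
 * Count the PSIDs visited by an inclusive loop from 0 to lastPSID.
 * With the corrected bound this equals numStripe; using numStripe
 * itself as the last PSID would visit one stripe too many.
 */
static uint64_t
example_count_psids(uint64_t numStripe)
{
	uint64_t last = example_last_psid(numStripe);
	uint64_t psid, count = 0;

	for (psid = 0; psid <= last; psid++)
		count++;
	return count;
}
```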


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base
# 1.124 08-Dec-2019 mlelstv

branches: 1.124.8;
Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

branches: 1.122.4;
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.12; 1.121.20;


Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.
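
The state change described in points 1) to 4) can be sketched as a small state machine. This is an illustrative sketch only: the enum and function names below are hypothetical stand-ins for the rf_ds_* states named above, not the kernel definitions.

```c
#include <assert.h>

/* Hypothetical mirror of the spare-related disk states. */
enum example_disk_state {
	EX_DS_SPARE,		/* idle hot spare */
	EX_DS_REBUILDING_SPARE,	/* rebuild to the spare is in progress */
	EX_DS_USED_SPARE	/* rebuild to the spare has completed */
};

/* Starting a rebuild moves the spare into the intermediate state. */
static enum example_disk_state
example_start_rebuild(enum example_disk_state s)
{
	assert(s == EX_DS_SPARE);
	return EX_DS_REBUILDING_SPARE;
}

/* Only a completed rebuild promotes the spare to "used". */
static enum example_disk_state
example_finish_rebuild(enum example_disk_state s)
{
	assert(s == EX_DS_REBUILDING_SPARE);
	return EX_DS_USED_SPARE;
}

/* A spare counts as up-to-date only once the rebuild has finished. */
static int
example_is_up_to_date(enum example_disk_state s)
{
	return s == EX_DS_USED_SPARE;
}
```

In this sketch a reboot during the rebuild would find the spare in the intermediate state, so it would not be treated as up-to-date.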


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]


reviewed by: oster, christos and darrenr
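
The split of a block count into a low word and a separate "Hi" word can be sketched as follows. This is a minimal sketch under stated assumptions: the struct and function names are hypothetical, and only the idea of storing the top 32 bits in numBlocksHi is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the size fields of a component label. */
struct example_label {
	uint32_t numBlocks;	/* low 32 bits of the block count */
	uint32_t numBlocksHi;	/* top 32 bits of the block count */
};

/* Recombine the two halves into the real 64-bit block count. */
static uint64_t
example_get_numblocks(const struct example_label *l)
{
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}

/* Store a 64-bit block count across the two 32-bit fields. */
static void
example_set_numblocks(struct example_label *l, uint64_t nblocks)
{
	l->numBlocks = (uint32_t)(nblocks & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(nblocks >> 32);
	assert(example_get_numblocks(l) == nblocks);
}
```

A label written before the new field existed would carry zero in the high word, so the recombined value equals the old 32-bit count.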


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur at
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to a strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.
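
The second item in the 1.82 entry (stop at the end of the list instead of
trusting a separately kept count) can be sketched as below. This is only an
illustration: struct rbuf_sketch, rbuf_push() and rbuf_free_all() are
hypothetical names, not the RAIDframe structures or rf_FreeReconControl()
itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a floating reconstruction buffer. */
struct rbuf_sketch {
    struct rbuf_sketch *next;
};

/* Push one new buffer on the front of the list; returns 0, or -1 if
 * the allocation fails (the list is left unchanged in that case). */
static int rbuf_push(struct rbuf_sketch **headp)
{
    struct rbuf_sketch *n;

    assert(headp != NULL);
    n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->next = *headp;
    *headp = n;
    return 0;
}

/* Free every buffer by walking until the NULL terminator, so a stale
 * or wrong count can neither leak buffers nor run past the end.
 * Returns the number of buffers actually freed. */
static int rbuf_free_all(struct rbuf_sketch *head)
{
    int freed = 0;

    while (head != NULL) {
        struct rbuf_sketch *next = head->next; /* save before free() */
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Walking to the terminator keeps the cleanup correct even when an I/O error
has left the bookkeeping count out of step with the list.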


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
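
The "on demand" calculation described in the 1.65 entry might look roughly
like the sketch below. It assumes the percentage is derived from two
reconstruction-unit counters; the parameter names echo numRUsComplete and
numRUsTotal from the 1.88 entry, but recon_percent_complete() itself is a
hypothetical helper, not a function from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Derive the completion percentage only when a caller asks for it,
 * instead of updating a stored percentCompleted field on every step.
 * Note: the multiplication can overflow for counts above
 * UINT64_MAX / 100; this sketch does not guard against that. */
static int recon_percent_complete(uint64_t rus_complete, uint64_t rus_total)
{
    assert(rus_complete <= rus_total);
    if (rus_total == 0)
        return 0;               /* nothing to reconstruct yet */
    return (int)(rus_complete * 100 / rus_total);
}
```

Keeping only the raw counters means the hot reconstruction path does no
division at all; the cost moves to the rare status query.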


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.126 23-Jul-2021 oster

Extensive mechanical changes to the pools used in RAIDframe.

Alloclist remains not per-RAID, so initialize that pool
separately/differently than the rest.

The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly
mechanical changes to functions to allocate/destroy per-RAID pools.
Needed to make raidPtr available in certain cases to be able to find
the per-RAID pools.

Extend rf_pool_init() to now populate a per-RAID wchan value that is
unique to each pool for a given RAID device.

TODO: Complete the analysis of the minimum number of items that are
required for each pool to allow IO to progress (i.e. so that a request
for pool resources can always be satisfied), and dynamically scale
minimum pool sizes based on RAID configuration.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.125 15-Feb-2021 oster

Fix a long long-standing off-by-one error in computing lastPSID.

SUsPerPU is only really supported for a value of 1, and since the
first PSID is 0, the last will be numStripe-1. Also update the
setting of pending_writes to reflect the change to lastPSID.

Needs pullups to -8 and -9.
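
The arithmetic behind the 1.125 fix can be written out as a small sketch.
It assumes, as the entry states, that SUsPerPU is 1 and that parity stripe
IDs start at 0; last_psid() and stripes_visited() are hypothetical helpers
for illustration, not code from rf_reconstruct.c.

```c
#include <assert.h>
#include <stdint.h>

/* With IDs numbered from 0, the last valid parity stripe ID is one
 * less than the number of stripes. */
static uint64_t last_psid(uint64_t num_stripes)
{
    assert(num_stripes > 0);    /* an empty set has no last stripe */
    return num_stripes - 1;
}

/* Count the stripes covered by an inclusive loop over 0..last_psid.
 * Using num_stripes itself as the upper bound would visit one extra,
 * nonexistent stripe. */
static uint64_t stripes_visited(uint64_t num_stripes)
{
    uint64_t visited = 0;
    uint64_t psid;

    for (psid = 0; psid <= last_psid(num_stripes); psid++)
        visited++;
    return visited;
}
```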


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base
# 1.124 08-Dec-2019 mlelstv

branches: 1.124.8;
Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

branches: 1.122.4;
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.12; 1.121.20;
Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.
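
The state change listed above can be pictured with a small sketch. Only the
names rf_ds_rebuilding_spare and rf_ds_used_spare come from this entry; the
enum below, the idle "spare" state, the helper functions, and the choice to
return a failed rebuild to the idle state are all assumptions made for
illustration.

```c
#include <assert.h>

/* Simplified, hypothetical subset of the disk states. */
enum disk_state_sketch {
    ds_spare,               /* idle hot spare */
    ds_rebuilding_spare,    /* rebuild onto this spare is in progress */
    ds_used_spare           /* rebuild onto this spare has completed */
};

/* Starting a rebuild no longer marks the spare as used. */
static enum disk_state_sketch start_rebuild(enum disk_state_sketch s)
{
    assert(s == ds_spare);
    return ds_rebuilding_spare;
}

/* Only a successful finish promotes the spare to "used".
 * What happens on failure is an assumption of this sketch. */
static enum disk_state_sketch finish_rebuild(enum disk_state_sketch s, int ok)
{
    assert(s == ds_rebuilding_spare);
    return ok ? ds_used_spare : ds_spare;
}

/* A component label should claim up-to-date data only for a used spare. */
static int label_is_up_to_date(enum disk_state_sketch s)
{
    return s == ds_used_spare;
}
```

With the intermediate state, a reboot in the middle of a rebuild finds the
spare still marked as rebuilding, so it is not mistaken for a component
with valid data.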

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]
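
As a rough illustration of the hi/lo split described in the first item, the
sketch below reassembles a 64-bit block count from two 32-bit halves. Only
numBlocksHi is taken from the commit message; the struct, the name of the
low half, and the helper functions are hypothetical stand-ins and are not
the actual RAIDframe definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down label: only the fields needed for the example. */
struct example_label {
	uint32_t numBlocks;	/* assumed name for the low 32 bits */
	uint32_t numBlocksHi;	/* top 32 bits, as added in this revision */
};

/* Reassemble the full 64-bit block count from the two halves. */
uint64_t
example_label_blocks(const struct example_label *l)
{
	assert(l != NULL);
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}

/* Split a 64-bit block count into the two 32-bit label fields. */
void
example_label_set_blocks(struct example_label *l, uint64_t blocks)
{
	assert(l != NULL);
	l->numBlocks = (uint32_t)(blocks & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(blocks >> 32);
}
```

Keeping the original 32-bit field as the low half means a label written for
a set smaller than 2TB reads the same with or without the new field, which
is presumably why the split was done this way rather than widening the
existing field.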


reviewed by: oster, christos and darrenr


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.
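
A minimal sketch of the accounting rule described above, assuming a simple
counter of outstanding writes. The event names and the function are
hypothetical and only mirror the idea; the real event types and the
reconstruction loop are more involved.

```c
#include <assert.h>

/* Hypothetical event kinds; the real RAIDframe event types differ. */
enum example_event {
	EXAMPLE_WRITE_DONE,
	EXAMPLE_WRITE_ERROR,
	EXAMPLE_READ_DONE
};

/*
 * Update the count of outstanding writes for one event.  A failed
 * write has still finished, so it must be counted just like a
 * successful one; otherwise the count never reaches zero and the
 * caller waits forever.
 */
int
example_account_event(int pending_writes, enum example_event ev)
{
	switch (ev) {
	case EXAMPLE_WRITE_DONE:
	case EXAMPLE_WRITE_ERROR:
		assert(pending_writes > 0);
		return pending_writes - 1;
	default:
		return pending_writes;
	}
}
```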


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.
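
The "rebuild in chunks" idea can be sketched as follows. This is not the
code from this revision: the function names and the notion of a fixed
window length in stripes are assumptions made for illustration. The clamp
also covers a set with fewer stripes than the window.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Given the first stripe of a chunk, return the last stripe of that
 * chunk.  The chunk is at most `window` stripes long and is clamped
 * so that it never runs past `last_stripe`.
 */
uint64_t
example_chunk_end(uint64_t first, uint64_t window, uint64_t last_stripe)
{
	assert(window > 0 && first <= last_stripe);
	if (last_stripe - first < window - 1)
		return last_stripe;
	return first + window - 1;
}

/* Count how many chunks are needed to cover stripes 0..last_stripe. */
uint64_t
example_count_chunks(uint64_t window, uint64_t last_stripe)
{
	uint64_t first = 0, chunks = 0;

	for (;;) {
		uint64_t end = example_chunk_end(first, window, last_stripe);

		chunks++;
		if (end == last_stripe)
			break;
		first = end + 1;
	}
	return chunks;
}
```

Status only has to be tracked for the stripes inside the current chunk, so
the memory needed depends on the window length and not on the size of the
set.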


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.
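
One reason 64-bit counters matter here is the progress calculation. The
sketch below is a hypothetical helper, not the RAIDframe code; the choice
to report 100 when there is nothing to do is an assumption of the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Percentage of reconstruction units completed.  The multiplication is
 * done in 64 bits; with 32-bit counters, complete * 100 would wrap once
 * complete exceeded about 42.9 million units.
 */
unsigned
example_percent_complete(uint64_t complete, uint64_t total)
{
	assert(complete <= total);
	if (total == 0)
		return 100;	/* assumption: nothing to do counts as done */
	return (unsigned)(complete * 100 / total);
}
```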


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().
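
The pre-allocation pattern in the second item can be sketched roughly as
below. The structure, the list layout, and both function names are
hypothetical; the real rf_LookupRUStatus() has a different signature and
works on a hash table of parity stripe status entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status node keyed by stripe ID. */
struct example_pss {
	unsigned long psid;
	struct example_pss *next;
};

/*
 * Look up `psid` in the list at *head.  If it is missing and `spare`
 * is non-NULL, link the pre-allocated spare in and return it.  The
 * function itself never allocates, so it can be called with a lock
 * held.  *used_spare tells the caller whether `spare` was consumed.
 */
struct example_pss *
example_lookup(struct example_pss **head, unsigned long psid,
    struct example_pss *spare, int *used_spare)
{
	struct example_pss *p;

	assert(head != NULL && used_spare != NULL);
	*used_spare = 0;
	for (p = *head; p != NULL; p = p->next)
		if (p->psid == psid)
			return p;
	if (spare == NULL)
		return NULL;
	spare->psid = psid;
	spare->next = *head;
	*head = spare;
	*used_spare = 1;
	return spare;
}

/* Caller side: allocate first, look up, then release an unused spare. */
struct example_pss *
example_get_status(struct example_pss **head, unsigned long psid)
{
	struct example_pss *spare, *p;
	int used;

	spare = calloc(1, sizeof(*spare));	/* before any lock is taken */
	/* a lock protecting the list would be acquired here */
	p = example_lookup(head, psid, spare, &used);
	/* and released here */
	if (!used)
		free(spare);
	return p;
}
```

The cost is an occasional allocation that is thrown away; in exchange, the
lookup path never has to allocate memory itself.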

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.125 15-Feb-2021 oster

Fix a long long-standing off-by-one error in computing lastPSID.

SUsPerPU is only really supported for a value of 1, and since the
first PSID is 0, the last will be numStripe-1. Also update the
setting of pending_writes to reflect the change to lastPSID.

Needs pullups to -8 and -9.
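
The arithmetic behind the fix is small enough to show directly. The helper
below is hypothetical and assumes, as the message states, one stripe per
PSID (SUsPerPU of 1) and PSIDs numbered from 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * With PSIDs numbered from 0, a set of `num_stripes` stripes has
 * PSIDs 0 .. num_stripes - 1.  Using num_stripes itself as the last
 * PSID would be one past the end.
 */
uint64_t
example_last_psid(uint64_t num_stripes)
{
	assert(num_stripes > 0);
	return num_stripes - 1;
}
```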


Revision tags: thorpej-futex-base bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3 ad-namecache-base2 ad-namecache-base1 ad-namecache-base
# 1.124 08-Dec-2019 mlelstv

Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

branches: 1.122.4;
- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-2-RELEASE netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.12; 1.121.20;


Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.
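
The state change described in points 1) to 4) can be sketched as a small state machine. The enum below is a hypothetical subset with illustrative names and values, not the definitions from the RAIDframe headers, and the choice to fall back to the idle-spare state on a failed rebuild is an assumption made for the example.

```c
#include <stdbool.h>

/* Hypothetical subset of the disk states named in the commit message. */
enum disk_state {
	DS_SPARE,		/* idle hot spare */
	DS_REBUILDING_SPARE,	/* rebuild to this spare is in progress */
	DS_USED_SPARE		/* rebuild finished; spare holds valid data */
};

/* Starting a rebuild moves an idle spare into the "rebuilding" state. */
static enum disk_state
start_rebuild(enum disk_state s)
{
	return (s == DS_SPARE) ? DS_REBUILDING_SPARE : s;
}

/* Only a successful finish promotes the spare to "used". */
static enum disk_state
finish_rebuild(enum disk_state s, bool ok)
{
	if (s != DS_REBUILDING_SPARE)
		return s;
	return ok ? DS_USED_SPARE : DS_SPARE;	/* fallback is an assumption */
}

/* Only a fully rebuilt spare may be treated as up to date after reboot. */
static bool
trusted_after_reboot(enum disk_state s)
{
	return s == DS_USED_SPARE;
}
```

A reboot in the middle of a rebuild finds the spare in the "rebuilding" state, so it is not mistaken for an up-to-date component.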

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.
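
A rough sketch of the bounds check and the stripe count follows. The function rebuild_in_chunks() and its parameters are hypothetical, and the loop body only counts stripes where the real code would rebuild them.

```c
#include <stdint.h>

/*
 * Hypothetical chunked loop: process stripes [0, num_stripes) in
 * windows of at most `window` stripes and return how many stripes
 * were visited.
 */
static uint64_t
rebuild_in_chunks(uint64_t num_stripes, uint64_t window)
{
	uint64_t start = 0, done = 0;

	if (num_stripes == 0 || window == 0)
		return 0;

	/* Bounds check: the set may be smaller than the window. */
	if (window > num_stripes)
		window = num_stripes;

	while (start < num_stripes) {
		uint64_t end = start + window;	/* exclusive upper bound */

		if (end > num_stripes)
			end = num_stripes;
		done += end - start;	/* stand-in for rebuilding the chunk */
		start = end;
	}
	return done;
}
```

Using an exclusive upper bound for each chunk is one way to avoid the "one more stripe than we thought" class of error.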

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Disscussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]
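
The split of a 64-bit block count across a low and a high 32-bit field can be sketched like this. The structure is a stand-in for the component label and holds only the two relevant members; the helper names are hypothetical, and uint32_t is used here on the assumption that u_int is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two label fields described above. */
struct label {
	uint32_t numBlocks;	/* low 32 bits of the block count */
	uint32_t numBlocksHi;	/* top 32 bits of the block count */
};

/* Store a 64-bit block count across the two 32-bit fields. */
static void
label_set_numblocks(struct label *l, uint64_t n)
{
	l->numBlocks = (uint32_t)(n & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(n >> 32);
}

/* Recombine the two fields into the real 64-bit block count. */
static uint64_t
label_numblocks(const struct label *l)
{
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}
```

A label written by older code leaves the high field at zero, so the recombined value equals the old 32-bit count.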


reviewed by: oster, christos and darrenr


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.
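
The accounting problem can be illustrated with a small counter sketch. The event codes and the recon_ctl structure are hypothetical and do not match the RAIDframe definitions; the sketch only shows that a failed write must be counted as finished, just like a successful one.

```c
enum recon_event {		/* hypothetical event codes */
	EV_WRITE_ISSUED,
	EV_WRITE_DONE,
	EV_WRITE_ERROR
};

struct recon_ctl {		/* hypothetical bookkeeping */
	int pending_writes;
	int write_errors;
};

static void
handle_event(struct recon_ctl *rc, enum recon_event ev)
{
	switch (ev) {
	case EV_WRITE_ISSUED:
		rc->pending_writes++;
		break;
	case EV_WRITE_DONE:
		rc->pending_writes--;
		break;
	case EV_WRITE_ERROR:
		/* A failed write has still finished: account for it. */
		rc->write_errors++;
		rc->pending_writes--;
		break;
	}
}
```

Without the decrement in the error case, pending_writes would never return to zero and a waiter would block forever.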


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.
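
The idea of a sliding status window can be sketched as below. This is not the reconmap implementation: the window size, the structure, and both helpers are assumptions chosen to show that the status storage stays a fixed size no matter how many reconstruction units the set contains.

```c
#include <stdint.h>
#include <string.h>

#define WINDOW 8	/* hypothetical window size, in reconstruction units */

struct recon_map {
	uint64_t base;		/* first RU covered by the window */
	uint8_t done[WINDOW];	/* 1 if RU (base + i) has been rebuilt */
};

/* Mark an RU as done; returns -1 if it lies outside the window. */
static int
mark_done(struct recon_map *m, uint64_t ru)
{
	if (ru < m->base || ru >= m->base + WINDOW)
		return -1;
	m->done[ru - m->base] = 1;
	return 0;
}

/* Slide the window once every RU in it is done; returns 1 if it slid. */
static int
try_advance(struct recon_map *m)
{
	for (int i = 0; i < WINDOW; i++)
		if (!m->done[i])
			return 0;
	m->base += WINDOW;
	memset(m->done, 0, sizeof(m->done));
	return 1;
}
```

Memory use here depends only on WINDOW, which is the property the commit message is after.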

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur,
at least at the very end of a disk, during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange errors on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.
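
The distinction between "blocked" and "stopped" can be sketched with a hypothetical completion handler. The constants and the read_done() function below are illustrative only; the one grounded detail is that a blocked submission returns 0 rather than the "read stopped" code.

```c
/* Hypothetical return codes for a per-disk read-completion handler. */
#define RECON_CONTINUE		0	/* keep processing this disk */
#define RECON_READ_STOPPED	1	/* this disk has no more reads */

/*
 * `blocked` models a buffer submission that could not proceed yet;
 * `last_unit` models having consumed the disk's final unit.
 */
static int
read_done(int blocked, int last_unit)
{
	if (blocked)
		return RECON_CONTINUE;	/* blocked is not finished */
	return last_unit ? RECON_READ_STOPPED : RECON_CONTINUE;
}
```

A caller that counts "stopped" disks to decide when to leave its main loop will then only leave once every disk has really finished.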


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.
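
Freeing a list by detecting its end, rather than by trusting a count, can be sketched as follows. The rbuf structure and both helpers are hypothetical, and malloc()/free() stand in for whatever allocator the real buffers use.

```c
#include <stdlib.h>

struct rbuf {			/* hypothetical reconstruction buffer */
	struct rbuf *next;
};

/*
 * Free every buffer on the list, stopping at the NULL terminator
 * rather than at a separately maintained count.  Returns the number
 * of buffers freed.
 */
static int
free_rbufs(struct rbuf *head)
{
	int freed = 0;

	while (head != NULL) {
		struct rbuf *next = head->next;

		free(head);
		head = next;
		freed++;
	}
	return freed;
}

/* Push a new buffer on the front; returns the new head. */
static struct rbuf *
push_rbuf(struct rbuf *head)
{
	struct rbuf *b = malloc(sizeof(*b));

	if (b == NULL)
		return head;
	b->next = head;
	return b;
}
```

After an I/O error the count and the list can disagree, which is why walking to the terminator is the safer choice in an error path.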

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().
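
The pre-allocation pattern can be sketched as a lookup that never allocates. Everything below is hypothetical (the pss structure, the list layout, the lookup_or_insert() name, and the `created` flag); it is meant to show the calling convention, not the real rf_LookupRUStatus() signature.

```c
#include <stddef.h>

struct pss {			/* hypothetical status entry */
	unsigned long psid;
	struct pss *next;
};

/*
 * Look up psid.  If it is absent, link in the caller's pre-allocated
 * entry `newp` and set *created to 1.  The function itself never
 * allocates, so it can be called with a lock held.
 */
static struct pss *
lookup_or_insert(struct pss **head, unsigned long psid,
    struct pss *newp, int *created)
{
	struct pss *p;

	*created = 0;
	for (p = *head; p != NULL; p = p->next)
		if (p->psid == psid)
			return p;
	if (newp == NULL)
		return NULL;	/* nothing found and nothing to insert */
	newp->psid = psid;
	newp->next = *head;
	*head = newp;
	*created = 1;
	return newp;
}
```

The caller allocates `newp` before taking the lock and frees it afterwards if `created` comes back 0.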

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
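
Computing the percentage only when asked for can be sketched as below. The function name and parameters are hypothetical; they stand for the completed and total reconstruction-unit counters, and returning 0 when the total is zero is an assumption made to keep the example safe.

```c
#include <stdint.h>

/*
 * Hypothetical on-demand calculation from the two counters kept
 * during a reconstruct; returns a value in 0..100.
 */
static int
percent_complete(uint64_t rus_complete, uint64_t rus_total)
{
	if (rus_total == 0)
		return 0;	/* nothing to do yet; avoid dividing by zero */
	return (int)(rus_complete * 100 / rus_total);
}
```

Nothing is stored between calls, so the reconstruction loop itself does no percentage arithmetic.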


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.124 08-Dec-2019 mlelstv

Switch to vn_bdev_open* functions.


Revision tags: phil-wifi-20191119
# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.20;


Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.
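
The state change described in this entry can be pictured with a small model. This is only an illustrative sketch, not the RAIDframe code: the enum and function names below are hypothetical, and only the two states marked in the comments correspond to rf_ds_rebuilding_spare and rf_ds_used_spare from the log message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the spare states described above. */
enum spare_state {
	SPARE_IDLE,		/* configured as a spare, not in use */
	SPARE_REBUILDING,	/* models rf_ds_rebuilding_spare */
	SPARE_USED		/* models rf_ds_used_spare */
};

/* Starting a rebuild moves the spare to the intermediate state. */
static enum spare_state
spare_begin_rebuild(enum spare_state s)
{
	assert(s == SPARE_IDLE);
	return SPARE_REBUILDING;
}

/* Only a completed rebuild moves the spare to the "used" state. */
static enum spare_state
spare_finish_rebuild(enum spare_state s)
{
	assert(s == SPARE_REBUILDING);
	return SPARE_USED;
}

/* After a reboot, only a fully rebuilt spare counts as up-to-date. */
static bool
spare_is_up_to_date(enum spare_state s)
{
	return s == SPARE_USED;
}
```

In this model a reboot that happens between spare_begin_rebuild() and spare_finish_rebuild() leaves the spare in a state that spare_is_up_to_date() rejects, which is the behaviour the fix is after.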


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]


reviewed by: oster, christos and darrenr
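
As an illustration of the first item, a 64-bit size can be carried in two 32-bit fields by splitting off the top 32 bits. The sketch below assumes nothing about the real component label layout: struct label_size and both helper names are hypothetical, and only the idea of a separate "Hi" field comes from the log message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the pair of label fields that hold one size. */
struct label_size {
	uint32_t lo;	/* low 32 bits (the pre-existing field) */
	uint32_t hi;	/* top 32 bits (models numBlocksHi/partitionSizeHi) */
};

/* Split a 64-bit block count into its low and high halves. */
static struct label_size
label_size_split(uint64_t blocks)
{
	struct label_size ls;

	ls.lo = (uint32_t)(blocks & 0xffffffffu);
	ls.hi = (uint32_t)(blocks >> 32);
	return ls;
}

/* Recombine the two halves into the original 64-bit value. */
static uint64_t
label_size_join(struct label_size ls)
{
	return ((uint64_t)ls.hi << 32) | ls.lo;
}
```

A label written before the change would simply have a zero high half in this model, so values below 2^32 read back unchanged.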


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.
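
The accounting problem can be shown with a toy counter. This is a hedged sketch rather than the driver's logic: the event names, struct recon_acct and both functions are hypothetical. The point is only that a write which ends in an error has still finished and must be subtracted from the outstanding count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event kinds; not the RAIDframe event codes. */
enum recon_event { EV_WRITE_DONE, EV_WRITE_ERROR, EV_READ_DONE };

/* Hypothetical bookkeeping for outstanding reconstruction writes. */
struct recon_acct {
	unsigned pending_writes;
	bool write_failed;
};

static void
recon_issue_write(struct recon_acct *a)
{
	a->pending_writes++;
}

static void
recon_handle_event(struct recon_acct *a, enum recon_event ev)
{
	switch (ev) {
	case EV_WRITE_DONE:
		assert(a->pending_writes > 0);
		a->pending_writes--;
		break;
	case EV_WRITE_ERROR:
		/* A failed write has completed too, so account for it. */
		assert(a->pending_writes > 0);
		a->pending_writes--;
		a->write_failed = true;
		break;
	case EV_READ_DONE:
		break;
	}
}
```

Without the decrement in the error case, anything waiting for pending_writes to reach zero would wait forever, which matches the symptom described above.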


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.
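
Rebuilding "in chunks rather than in one big lump" amounts to walking the stripes one window at a time. The following is a minimal sketch under assumptions, not the actual reconstruction loop: chunk_end() and count_chunks() are hypothetical helpers, and the window size is an arbitrary parameter. The final chunk is clamped so that a set smaller than the window is still handled.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the exclusive end of the chunk that
 * starts at 'start', never running past 'total' stripes.
 */
static uint64_t
chunk_end(uint64_t start, uint64_t total, uint64_t window)
{
	uint64_t remaining;

	assert(window > 0 && start <= total);
	remaining = total - start;
	return start + (remaining < window ? remaining : window);
}

/* Walk the whole set chunk by chunk; return how many chunks were visited. */
static uint64_t
count_chunks(uint64_t total, uint64_t window)
{
	uint64_t start = 0, chunks = 0;

	while (start < total) {
		start = chunk_end(start, total, window);
		chunks++;
	}
	return chunks;
}
```

Per-chunk status only needs to cover 'window' stripes at a time in such a scheme, so the bookkeeping stays bounded no matter how large the set is.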


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August; I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
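
Computing the percentage "on demand" can be as simple as deriving it from the two counters when somebody asks. The sketch below is an assumption-laden illustration, not the code in rf_netbsdkintf.c: recon_percent_complete() is a hypothetical name, and its parameters merely stand in for the completed and total reconstruction unit counts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical on-demand calculation: derive a 0..100 percentage from
 * the completed and total reconstruction unit counts.
 */
static int
recon_percent_complete(uint64_t rus_complete, uint64_t rus_total)
{
	if (rus_total == 0)
		return 0;	/* nothing to do yet; avoid dividing by zero */
	if (rus_complete >= rus_total)
		return 100;
	/* Integer division truncates; assumes rus_complete * 100 fits in 64 bits. */
	return (int)(rus_complete * 100 / rus_total);
}
```

Nothing has to be updated in the reconstruction path itself with this approach; the cost is paid only by the caller that wants the number.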


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


# 1.123 10-Oct-2019 christos

fix the function pointer and callback mess:
- callback functions return 0 and their result is not checked; make them void.
- there are two types of callbacks and they used to overload their parameters
and the callback structure; separate them into "function" and "value"
callbacks.
- make the wait function signature consistent.


Revision tags: netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.122 09-Feb-2019 christos

- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.
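
The sizeof(type) -> sizeof(*variable) change can be illustrated with a
minimal userland sketch. The names below (struct recon_desc,
recon_desc_alloc) are hypothetical, and calloc() merely stands in for the
RAIDframe allocation macros, which are not reproduced here.

```c
#include <stdlib.h>

/* Hypothetical structure, for illustration only. */
struct recon_desc {
	int col;
	unsigned long long num_units;
};

/*
 * Allocate one zeroed descriptor.  sizeof(*desc) follows the type of
 * the pointer, so the size stays correct even if the declaration of
 * desc is changed later; sizeof(struct recon_desc) would have to be
 * updated by hand.
 */
struct recon_desc *
recon_desc_alloc(void)
{
	struct recon_desc *desc;

	desc = calloc(1, sizeof(*desc));
	return desc;	/* NULL if the allocation failed */
}
```

A caller would check the result for NULL and release it with free().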


Revision tags: netbsd-8-1-RELEASE netbsd-8-1-RC1 pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

branches: 1.121.20;
Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.
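
The state change described above can be sketched as a tiny state machine.
This is an illustration only: the enum and function names are
hypothetical, and only the two spare states named in this entry
(rf_ds_rebuilding_spare and rf_ds_used_spare) are modelled.

```c
#include <assert.h>

/* Hypothetical subset of disk states, for illustration only. */
enum spare_state {
	SPARE_IDLE,		/* configured spare, not in use */
	SPARE_REBUILDING,	/* models rf_ds_rebuilding_spare */
	SPARE_USED		/* models rf_ds_used_spare */
};

/* A rebuild to the spare begins: it is NOT yet a used spare. */
enum spare_state
spare_rebuild_start(enum spare_state s)
{
	assert(s == SPARE_IDLE);
	return SPARE_REBUILDING;
}

/* The rebuild finished successfully: only now is it a used spare. */
enum spare_state
spare_rebuild_done(enum spare_state s)
{
	assert(s == SPARE_REBUILDING);
	return SPARE_USED;
}

/* Only a fully rebuilt spare may be treated as up-to-date. */
int
spare_is_up_to_date(enum spare_state s)
{
	return s == SPARE_USED;
}
```

In this model, a reboot while the state is SPARE_REBUILDING leaves the
spare reported as not up-to-date, which is the behaviour the fix is
after.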


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]


reviewed by: oster, christos and darrenr
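
The split of a 64-bit block count across a low and a "Hi" member might
look roughly like the sketch below. The struct and the two helper
functions are hypothetical; only the member name numBlocksHi comes from
this entry, and uint32_t is used here as a stand-in for u_int on the
assumption that it is 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical cut-down label holding only the block count. */
struct label_sizes {
	uint32_t numBlocks;	/* low 32 bits of the block count */
	uint32_t numBlocksHi;	/* top 32 bits of the block count */
};

/* Store a 64-bit block count into the two 32-bit members. */
void
label_set_numblocks(struct label_sizes *l, uint64_t n)
{
	l->numBlocks = (uint32_t)(n & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(n >> 32);
}

/* Reassemble the 64-bit block count from the two members. */
uint64_t
label_get_numblocks(const struct label_sizes *l)
{
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}
```

A label written before the Hi member existed would read back with the
top half as zero provided that field was zero-filled, which is one way
such a layout can stay compatible with older labels.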


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last august, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panic'ed in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
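
A minimal sketch of the "on demand" calculation described above, not the
RAIDframe code itself. The function name demo_percent_complete() and its
parameters are hypothetical, and returning 0 for an empty total is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the completion percentage only when a
 * caller asks for it, instead of storing it during the reconstruct.
 * Assumes rus_complete <= rus_total and that 100 * rus_complete does
 * not overflow 64 bits.
 */
static int
demo_percent_complete(uint64_t rus_complete, uint64_t rus_total)
{
	if (rus_total == 0)
		return 0;	/* assumption: nothing to do reads as 0% */
	return (int)(100 * rus_complete / rus_total);
}
```

The reconstruct path then only has to keep the two counters up to date;
the division happens in whichever caller wants to report progress.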


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.


Revision tags: isaki-audio2-base
# 1.122 09-Feb-2019 christos

- Change the allocation macros to be more like function calls
- Change sizeof(type) -> sizeof(*variable)
- Use macros for the long buffer length allocations
- Remove "bit polishing" memsets() -- do them only once
- Remove unnecessary casts

Thanks to oster@ for finding bugs and testing.


Revision tags: pgoyette-compat-merge-20190127 pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226 pgoyette-compat-1126 pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 jdolecek-ncqfixes-base pgoyette-compat-0728 netbsd-8-0-RELEASE phil-wifi-base pgoyette-compat-0625 netbsd-8-0-RC2 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 netbsd-8-0-RC1 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226 nick-nhusb-base-20150921 nick-nhusb-base-20150606 nick-nhusb-base-20150406 nick-nhusb-base
# 1.121 14-Nov-2014 oster

Fix a long-standing bug related to rebooting while a
reconstruct-to-spare is underway but not yet complete.

The issue was that a component was being marked as a used_spare when
the rebuild started, not when the rebuild was actually finished.
Marking it as a used_spare meant that the component label on the spare
was being updated such that after a reboot the component would be
considered up-to-date, regardless of whether the rebuild actually
completed!

This fix includes:
1) Add an additional state "rf_ds_rebuilding_spare" which is used
to denote that a spare is currently being rebuilt from the live
components.
2) Update the comments on the disk states, which were out-of-sync
with reality.
3) When rebuilding to a spare component, that spare now enters the
state rf_ds_rebuilding_spare instead of the state rf_ds_used_spare.
4) When the rebuild is actually complete then the spare component
enters the rf_ds_used_spare state. rf_ds_used_spare is now used
exclusively for the case where the rebuilding to the spare has
completed successfully.
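
The state change in items 3) and 4) can be sketched as below. This is an
illustration under stated assumptions, not the driver code: the enum and
the demo_* functions are hypothetical, and only the two states named in
the comments (rf_ds_rebuilding_spare, rf_ds_used_spare) come from this
entry.

```c
#include <assert.h>

/* Hypothetical subset of the disk states relevant to this fix. */
enum demo_disk_state {
	DEMO_SPARE,		/* idle hot spare */
	DEMO_REBUILDING_SPARE,	/* models rf_ds_rebuilding_spare */
	DEMO_USED_SPARE		/* models rf_ds_used_spare */
};

/* Starting a rebuild marks the spare as "rebuilding", not "used". */
static enum demo_disk_state
demo_start_rebuild(enum demo_disk_state s)
{
	return (s == DEMO_SPARE) ? DEMO_REBUILDING_SPARE : s;
}

/* Only a rebuild that actually completed promotes the spare. */
static enum demo_disk_state
demo_finish_rebuild(enum demo_disk_state s)
{
	return (s == DEMO_REBUILDING_SPARE) ? DEMO_USED_SPARE : s;
}

/* After a reboot, only a fully rebuilt spare counts as up to date. */
static int
demo_is_up_to_date(enum demo_disk_state s)
{
	return s == DEMO_USED_SPARE;
}
```

With the intermediate state, a reboot in the middle of a rebuild leaves
the spare in the "rebuilding" state, so it is no longer mistaken for a
component whose contents are current.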

XXX: Someday we need to teach raidctl(8) about this new state, and
take out the backwards compatibility code in rf_netbsdkintf.c (see
RAIDFRAME_GET_INFO in raidioctl()). For today, this fix needs to be
generic enough that it can get backported without major grief.

XXX: Needs pullup to netbsd-5*, netbsd-6*, and netbsd-7

Fixes PR#49244.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.120 14-Jun-2014 hannken

branches: 1.120.2;
Change dk_lookup() to return an anonymous vnode not associated with
any file system. Change all consumers of dk_lookup() to get the
device from "v_rdev" instead of VOP_GETATTR() as specfs does not
support VOP_GETATTR(). Devices obtained with dk_lookup() will no
longer disappear on forced unmounts.

Fix for PR kern/48849 (root mirror raid fails on shutdown)

Welcome to 6.99.44


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base rmind-smpnet-base agc-symver-base
# 1.119 06-Mar-2013 yamt

branches: 1.119.10;
fix parens in a message


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6 jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.118 20-Feb-2012 oster

branches: 1.118.2;
Add logic to the main reconstruction loop to handle RAID5 with rotated
spares. While here, observe that we were actually doing one more
stripe than we thought we were, and correct that too (it didn't matter
for non-RAID5_RS, but it definitely does for RAID5_RS). Add some
bounds-checking at the beginning to handle the case where the number
of stripes in the set is smaller than the sliding reconstruction window.

XXX: this problem likely needs to be fixed for PARITY_DECLUSTERING too.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.117 14-Oct-2011 hannken

branches: 1.117.2; 1.117.6; 1.117.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.116 03-Aug-2011 oster

Address part of PR kern/44972. From YAMAMOTO Takashi. Thanks!


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.115 28-May-2011 yamt

rf_ReconstructInPlace: don't leave a vnode open on errors.
fixes a part of PR/44972.


# 1.114 24-May-2011 buhrow

Suggested to oster@ and approved via private e-mail as a help to
people who are getting reconstruction failures.


# 1.113 11-May-2011 mrg

convert the main raidPtr mutex to a kmutex, and add a couple of cv's to
cover the old sleep/wakeup points for adding_hot_spare and waitForReconCond.
convert all remaining simple_lock's to kmutexes (they're not used or compiled
right now... even with all options enabled) and remove the support for them.

this leaves just a pair of tsleep()/wakeup() calls using old scheduling APIs.


# 1.112 02-May-2011 mrg

convert rb_mutex to a kmutex/cv.


Revision tags: bouyer-quota2-nbase
# 1.111 19-Feb-2011 enami

Define accessors for number of blocks and partition size in the
component label and use them where appropriate. Discussed on tech-kern.


Revision tags: bouyer-quota2-base jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.110 19-Nov-2010 dholland

branches: 1.110.2; 1.110.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4
# 1.109 01-Nov-2010 mrg

add support for >2TB raid devices.

- add two new members to the component label:
u_int numBlocksHi
u_int partitionSizeHi
and store the top 32 bits of the real number of blocks and
partition size. modify rf_print_component_label(),
rf_does_it_fit(), rf_AutoConfigureDisks() and
rf_ReconstructFailedDiskBasic().

- call disk_blocksize() after disk_attach() [ from mlelstv ]

- shift the block number relative to DEV_BSHIFT in raidstart()
and InitBP() so that accesses work for non 512-byte devices.
[ from mlelstv ]

- update rf_getdisksize() to use the new getdisksize() [ from
mlelstv. this part needs a separate change for netbsd-5. ]


reviewed by: oster, christos and darrenr
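
Splitting a 64-bit block count across the existing 32-bit field and the
new numBlocksHi field can be sketched as follows. This is a minimal
illustration, not the RAIDframe code: struct demo_label and the demo_*
helpers are hypothetical, "numBlocks" is an assumed name for the low
half, and uint32_t stands in for the u_int fields named above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of a component label. */
struct demo_label {
	uint32_t numBlocks;	/* assumed name: low 32 bits */
	uint32_t numBlocksHi;	/* top 32 bits, as added by this change */
};

/* Store a 64-bit block count as two 32-bit halves. */
static void
demo_set_numblocks(struct demo_label *l, uint64_t nblocks)
{
	l->numBlocks = (uint32_t)(nblocks & 0xffffffffu);
	l->numBlocksHi = (uint32_t)(nblocks >> 32);
}

/* Reassemble the 64-bit block count from the two halves. */
static uint64_t
demo_get_numblocks(const struct demo_label *l)
{
	return ((uint64_t)l->numBlocksHi << 32) | l->numBlocks;
}
```

A label written before this change would carry zero in the high field
if that space was previously zeroed, in which case the reassembled
value equals the old 32-bit count; whether that holds for every
existing label is not stated in this entry.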


Revision tags: uebayasi-xip-base3 yamt-nfs-mp-base11 uebayasi-xip-base2 yamt-nfs-mp-base10 uebayasi-xip-base1 yamt-nfs-mp-base9 uebayasi-xip-base matt-premerge-20091211
# 1.108 17-Nov-2009 jld

branches: 1.108.2; 1.108.4;
Finally commit the RAIDframe parity map Summer Of Code project.

Drastically reduces the amount of time spent rewriting parity after an
unclean shutdown by keeping better track of which regions might have had
outstanding writes. Enabled by default; can be disabled on a per-set
basis, or tuned, with the new raidctl(8) commands.

Discussed on tech-kern@ to a general air of approval; exhortations to
commit from mrg@, christos@, and others.

Thanks to Google for their sponsorship, oster@ for mentoring the
project, assorted developers for trying very hard to break it, and
probably more I'm forgetting.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jymxensuspend-base yamt-nfs-mp-base6 yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-nbase yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base2 jym-xensuspend-base nick-hppapmap-base
# 1.107 11-Feb-2009 oster

If we see a RF_RECON_WRITE_ERROR event we know a write has finished and
we need to account for that. Failure to do so means we can end up
waiting forever for writes we think are outstanding, but which have
already completed.

Addresses the RAIDframe part of PR#40569. Thanks to Matthias Scheler
for reporting the issue and verifying the fix.


Revision tags: mjf-devfs2-base
# 1.106 20-Dec-2008 oster

branches: 1.106.2;
When unconfiguring an array where a reconstruct is in progress, abort
the reconstruct and wait for IOs to drain before pulling the plug.

Should fix the panic reported by der Mouse on tech-kern.


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4 haad-dm-base
# 1.105 23-Sep-2008 oster

branches: 1.105.2; 1.105.4;
Nuke unneeded printf(). Spotted by pooka@.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 wrstuden-revivesa-base-1 simonb-wapbl-nbase yamt-pf42-base4 simonb-wapbl-base yamt-pf42-base3 hpcarm-cleanup-nbase wrstuden-revivesa-base
# 1.104 19-May-2008 oster

branches: 1.104.4;
Re-work some of the guts of the reconstruction code.

Reconmap used to have one pointer for every reconstruction unit. This
does not scale well in the land of 1TB disks, where some 100MB+ of
"status pointers" are required for typical configurations. Convert
the reconstruction code to use a "sliding status window" which will
scale nicely regardless of the number of stripes/reconstruction units
in the RAID set. Convert the main reconstruction loop to rebuild the
array in chunks rather than in one big lump.
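
Rebuilding "in chunks" can be pictured with the sketch below. It is an
illustration only: demo_rebuild_in_chunks(), its callback, and the
fixed-size window are hypothetical and much simpler than the actual
sliding status window in the reconstruction code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical driver loop: walk num_stripes in windows of at most
 * `window` stripes, so per-stripe status only needs to exist for the
 * current window.  Returns the number of chunks processed.
 */
static uint64_t
demo_rebuild_in_chunks(uint64_t num_stripes, uint64_t window,
    void (*rebuild)(uint64_t first, uint64_t count, void *arg), void *arg)
{
	uint64_t start = 0, chunks = 0;

	if (window == 0)
		return 0;
	while (start < num_stripes) {
		uint64_t count = num_stripes - start;

		if (count > window)
			count = window;	/* final chunk may be short */
		if (rebuild != NULL)
			rebuild(start, count, arg);
		start += count;
		chunks++;
	}
	return chunks;
}
```

Memory for status tracking then depends on the window size rather than
on the total number of stripes, which is the scaling property this
entry is after.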

As part of these changes, introduce a function to kick any waiters on
the head separation callback list, and use that in the main
reconstruction event queue to wake up the waiters if things have
stalled. (I believe this may fix a race condition that could occur
at least at the very end of a disk during reconstruction under heavy
IO load.)

Thanks to Brian Buhrow for all his help, support, and patience in
testing these changes.


Revision tags: yamt-pf42-baseX yamt-pf42-base2 yamt-nfs-mp-base2 yamt-nfs-mp-base yamt-pf42-base
# 1.103 15-Apr-2008 oster

branches: 1.103.2; 1.103.4; 1.103.6;
A forced recon read should not default to indicating that the reads
for that disk have stopped, since this will bump us out of the normal
reconstruction loop prematurely.

Fixes the (mostly cosmetic) bug where the reconstruction
status values stop updating, and from raidctl it appears that
reconstruction has totally stalled (which it actually hasn't -- the
reconstruction does complete properly, but not in the normal way).


# 1.102 14-Apr-2008 oster

Print out the status value if a reconstruction read fails.
Don't print out write promotions during reconstruct unless
we are debugging reconstructs.


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14 keiichi-mipv6-nbase nick-net80211-sync-base keiichi-mipv6-base matt-armv6-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.101 26-Jan-2008 oster

branches: 1.101.6;
In a land before time, when kernel processes roamed the system, we
needed to keep track of the kernel process that opened a device in
order to close it with the right credentials. Flash forward to today
where curlwp is now quite sufficient.


Revision tags: bouyer-xeni386-merge1 vmlocking2-base3 bouyer-xeni386-nbase yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 bouyer-xeni386-base yamt-kmem-base vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase matt-armv6-base jmcneill-pm-base reinoud-bufcleanup-base
# 1.100 26-Nov-2007 pooka

Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 yamt-x86pmap-base2 yamt-x86pmap-base vmlocking-base
# 1.99 21-Sep-2007 oster

branches: 1.99.6;
Fix wording in a comment and correct a debug line. From Olivier Cherrier
(via private mail). Thanks!


Revision tags: nick-csl-alignment-base5 matt-mips64-base
# 1.98 18-Jul-2007 ad

branches: 1.98.4; 1.98.6; 1.98.8;
Fix fallout from recent kthread changes.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.97 09-Jul-2007 ad

branches: 1.97.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.96 26-Jun-2007 cube

Change dk_lookup() to accept an additional argument of the type enum uio_seg
that tells whether the given path is in user space or kernel space, so it
can tell NDINIT().

While the raidframe calls were ok, both ccd(4) and cgd(4) were passing
pointers to user space data, which leads to strange error on i386, as
reported by Jukka Salmi on current-users.

The issue has been there since last August, I'm actually a bit surprised
that no one in the meantime has used ccd(4) or cgd(4) on an arch where it
would have simply faulted.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base ad-audiomp-base post-newlock2-merge newlock2-nbase yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 newlock2-base netbsd-4-base
# 1.95 16-Nov-2006 christos

branches: 1.95.2; 1.95.8; 1.95.10; 1.95.16;
__unused removal on arguments; approved by core.


Revision tags: yamt-splraiseipl-base2
# 1.94 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 yamt-pdpolicy-base8 rpaulo-netinet-merge-pcb-base
# 1.93 27-Aug-2006 christos

branches: 1.93.2; 1.93.4;
- use dk_lookup instead of our home-spun version.
- allow raid to be configured in a wedge
- allow wedges to be configured in a raid
- add autoconfiguration of wedges in a raid


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.92 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.91 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 peter-altq-base yamt-pdpolicy-base2 elad-kernelauth-base yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.90 11-Dec-2005 christos

branches: 1.90.4; 1.90.6; 1.90.8; 1.90.10; 1.90.12;
merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.89 18-Jul-2005 oster

If rf_SubmitReconBuffer indicates the submission was blocked (for
whatever reason), return 0 instead of the default
RF_RECON_READ_STOPPED. Returning RF_RECON_READ_STOPPED would result
in rf_ContinueReconstructFailedDisk() thinking that the given
component was "done" and breaking out of the main reconstruction loop
far too early. Reconstruction still worked correctly as long as there
were no errors, but RAIDframe wouldn't be in a position to properly
handle read/write errors during reconstruction.

This fixes the "raidctl's progress bar spins at 0% until
reconstruction finishes" problem.


# 1.88 08-Jun-2005 oster

branches: 1.88.2;
- initialize numRUsTotal before we indicate that we are doing a reconstruct.

- make numRUsComplete and numRUsTotal 64-bit quantities like
everything else that records this information.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base kent-audio2-base
# 1.87 27-Feb-2005 perry

branches: 1.87.2;
nuke trailing whitespace


Revision tags: yamt-km-base2
# 1.86 12-Feb-2005 oster

The 'next' argument to rf_CreateDiskQueueData is always NULL. Since
there is no particular reason to pass an extra NULL argument, turf it,
and initialize p->next to NULL within the function.


# 1.85 12-Feb-2005 oster

Add a 'waitflag' argument to rf_CreateDiskQueueData() and use it to
determine if we are willing to wait for memory to come from the
diskqueuedata (dqd) and bufpool pools. Cleanup the mess related to
code calling rf_CreateDiskQueueData() with different expectations
(and/or blatant disregard) of what might happen if there were
insufficient pool resources.


# 1.84 06-Feb-2005 oster

It's not a bad idea to update the component labels whether or not the
reconstruction was successful.


# 1.83 05-Feb-2005 oster

rf_GetNextReconEvent() *will* return a valid event, so no need for
the assert. (we'd have panicked in there long before this assert
if that wasn't the case).

Minor whitespace changes.


# 1.82 05-Feb-2005 oster

Vastly improve the error handling in the case of a read/write error
that occurs during a reconstruction. We go from zero error handling
and likely panicking if something goes amiss, to gracefully bailing and
leaving the system in the best, usable state possible.

- introduce rf_DrainReconEventQueue() to allow easy cleaning of the
reconstruction event queue

- change how we cleanup the floating recon buffers in
rf_FreeReconControl(). Detect the end of the list rather
than traversing according to a count.

- keep track of the number of pending reconstruction writes. In the
event of a read error, use this to wait long enough for the pending
writes to (hopefully) drain.

- more cleanup is still needed on this code, but I didn't want to
start mixing major functional changes with minor cleanups.

XXX: There is a known issue with pool items left outstanding due to
the IO failure, and this can show up in the form of a panic at the
tail end of a shutdown. This problem is much less severe than before
these changes, and the hope/plan is that this problem will go away
once this code gets overhauled again.
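
As a rough illustration of the "detect the end of the list rather than
traversing according to a count" change, the sketch below frees a singly
linked list by walking to its NULL terminator. The struct and function names
are hypothetical stand-ins, and plain malloc/free is used here in place of
the kernel allocators; this is not the rf_FreeReconControl() code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a floating reconstruction buffer. */
struct rbuf {
	struct rbuf *next;
};

/* Prepend a new buffer to the list; aborts if memory is unavailable. */
static struct rbuf *
rbuf_push(struct rbuf *head)
{
	struct rbuf *b = malloc(sizeof(*b));

	if (b == NULL)
		abort();
	b->next = head;
	return b;
}

/*
 * Free every buffer, stopping at the NULL terminator instead of trusting
 * a separately maintained count.  Returns the number of buffers freed.
 */
static int
rbuf_free_all(struct rbuf *head)
{
	int freed = 0;

	while (head != NULL) {
		struct rbuf *next = head->next;	/* save before freeing */

		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```

Walking to the terminator stays correct even when an error path has left the
list shorter or longer than a stored count would suggest.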


Revision tags: yamt-km-base
# 1.81 22-Jan-2005 oster

branches: 1.81.2;
Torch some #define's missed in last commit.


# 1.80 22-Jan-2005 oster

Reconstruction Descriptors are only allocated once per reconstruction,
and don't need their own pool or freelist or anything fancier than a
malloc/free.


# 1.79 18-Jan-2005 oster

ForceReconReadDoneProc() needs a return after doing the first
rf_CauseReconEvent().


Revision tags: kent-audio1-beforemerge
# 1.78 12-Dec-2004 oster

branches: 1.78.2;
The switch() in rf_ContinueReconstructFailedDisk() is never actually
used in non-simulation code, and thus is just wasting space (and
making the code more confusing to read!). Turf the switch, left-shift
the indentation of code, and nuke 'state' field of struct RF_RaidReconDesc_s.

No real functional changes.


Revision tags: kent-audio1-base
# 1.77 15-Nov-2004 oster

continueFunc and continueArg aren't used. Turf. Simplify calls to
rf_GetNextReconEvent().


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.76 18-Mar-2004 oster

branches: 1.76.4;
Re-work the locking mechanisms for reconstruct and PSS structures
such that we don't actually hold a simplelock while we are doing
a pool_get(), but are still effectively protecting critical code.

This should fix all of the outstanding LOCKDEBUG warnings related to
rebuilding RAID sets.


# 1.75 13-Mar-2004 oster

- don't use rf_PrintUserStats() for recon statistics.
rf_PrintUserStats() was meant for the simulator, and doesn't provide
any real info in kernel-space, especially for reconstructs.
Reconstructing actually renders the stats even more useless, since it
resets them all to zero before the reconstruct starts!

- since rf_PrintUserStats() is no longer used, nuke it along with the
routines that feed it. Nothing was using this code, and if we ever
need it again, we know where to find it.


# 1.74 07-Mar-2004 oster

- Introduce rf_pools which contains all of the various global pools used
by RAIDframe. Convert all other RAIDframe global pools to use pools
defined within this new structure.
- Introduce rf_pool_init(), used for initializing a single pool in
RAIDframe. Teach each of the configuration routines to use
rf_pool_init().
- Cleanup a few pool-related comments.
- Cleanup revent initialization and #defines.
- Add a missing pool_destroy() for the reconbuffer pool.

(Saves another 1K off of an i386 GENERIC kernel, and makes
stuff a lot more readable)


# 1.73 07-Mar-2004 oster

- fix up initialization of rf_recond_pool
- introduce rf_reconbuffer_pool and teach rf_MakeReconBuffer() to use it


# 1.72 05-Mar-2004 oster

Use RF_INCLUDE_PARITY_DECLUSTERING_DS to #if-out more unneeded bits.
(We can't do RF_DISTRIBUTE_SPARE bits without the parity declustering stuff.)


# 1.71 03-Mar-2004 oster

Nuke some unnecessary casts. No functional changes.


# 1.70 03-Mar-2004 oster

Introduce RF_REVENT_READ_FAILED, RF_REVENT_WRITE_FAILED and RF_REVENT_FORCEREAD_FAILED.
This removes 3 more RF_PANIC()'s (but we'll currently still panic if any of these cases occur).
fix up a few printf's.
XXX: still needs more cleanup and testing (and be taught to not panic).


# 1.69 03-Mar-2004 oster

Cleanup function prototypes.


# 1.68 03-Mar-2004 oster

- cleanup memory allocation in rf_AllocPSStatus()
- change function signature of rf_LookupRUStatus(). The last argument
is now a pointer to a new PSS, in case one is needed. Rather than
having rf_LookupRUStatus() allocate a new PSS, we pre-allocate one
beforehand, where necessary, just in case.
- change callers of rf_LookupRUStatus() to deal with the new way of
calling rf_LookupRUStatus().

[no improvement or worsening of parity rebuild/initialization performance.]


# 1.67 01-Mar-2004 oster

Use RF_ACC_TRACE to #if out more chunks of code related only
to access tracing. (not turned on yet)


# 1.66 29-Feb-2004 oster

Adjust _rf_ShutdownCreate() so that it is willing to wait for more
memory. Since we only now ever "return(0)", just return (void)
instead.

Cleanup all uses of rf_ShutdownCreate() to not worry about
it ever failing. Shaves another 600 bytes off of an i386 GENERIC kernel.


# 1.65 04-Jan-2004 oster

raidPtr->reconControl->percentCompleted only gets used in one
debugging printf, and in rf_netbsdkintf.c. We can do the calculations
inside of RF_DEBUG_RECON for the one debugging printf, and only
perform the percentCompleted calculation "on demand" in the
rf_netbsdkintf.c case. Shaves a few more bytes off an i386 GENERIC
kernel, and ever-so-slightly decreases the amount of work performed
during a reconstruct.
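
A minimal sketch of computing the percentage "on demand" follows. The
function name is hypothetical and the exact formula used in
rf_netbsdkintf.c is an assumption; the parameters are named after the
numRUsComplete and numRUsTotal counters, which later revisions of this log
describe as 64-bit quantities.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive a 0..100 percentage from the two counters
 * only when a caller asks for it, instead of storing it on every update.
 */
static int
percent_completed(uint64_t num_rus_complete, uint64_t num_rus_total)
{
	if (num_rus_total == 0)
		return 0;		/* nothing to reconstruct yet */
	assert(num_rus_complete <= num_rus_total);

	/* Avoid overflowing the multiplication for very large counters. */
	if (num_rus_complete > UINT64_MAX / 100)
		return (int)(num_rus_complete / (num_rus_total / 100));

	return (int)(num_rus_complete * 100 / num_rus_total);
}
```

Integer division truncates, so 1 of 3 units reports 33 rather than rounding
up.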


# 1.64 31-Dec-2003 oster

Add in a bunch of RF_SIGNAL_COND()'s that were missing. Tidy up
a few lines.


# 1.63 31-Dec-2003 oster

Left-shift another else{} chunk. No functional changes.


# 1.62 31-Dec-2003 oster

left-shift the "else" part of the if(!lp_SubmitReconBuffer) condition.
Cleanup. No real functional changes, just more readable.


# 1.61 31-Dec-2003 oster

Negate a condition, and flip if/else parts. Preparation for left-shifting
the (now) else part. No real functional change.


# 1.60 30-Dec-2003 oster

Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]


# 1.59 29-Dec-2003 oster

Let's see... raidPtr->recon_done_procs is never set to anything
(other than NULL when raidPtr is initialized). That means
SignalReconDone() never does anything useful. Bye-bye!

Say good-bye to recon_done_procs and recon_done_procs_mutex (and its
initializer) as well.


# 1.58 29-Dec-2003 oster

- first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.


# 1.57 29-Dec-2003 oster

[Having received a definite lack of strenuous objection, a small amount
of strenuous agreement, and some general agreement, this commit is
going ahead because it's now starting to block some other changes I
wish to make.]

Remove most of the support for the concept of "rows" from RAIDframe.
While the "row" interface has been exported to the world, RAIDframe
internals have really only supported a single row, even though they
have feigned support of multiple rows.

Nothing changes in configuration land -- config files still need to
specify a single row, etc. All auto-config structures remain fully
forward/backwards compatible.

The only visible difference to the average user should be a
reduction in the size of a GENERIC kernel (i386) by 4.5K. For those
of us trolling through RAIDframe kernel code, a lot of the driver
configuration code has become a LOT easier to read.


# 1.56 29-Jun-2003 fvdl

branches: 1.56.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.55 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.54 10-Apr-2003 simonb

Remove an assigned-to but unused variable.


# 1.53 21-Mar-2003 dsl

Use 'void *' instead of 'caddr_t' in prototypes of VOP_IOCTL, VOP_FCNTL
and VOP_ADVLOCK, delete casts from callers (and some to copyin/out).


# 1.52 09-Feb-2003 jdolecek

constify some


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.51 19-Nov-2002 oster

For reconstructs, move checks for failed components to before the
kernel threads are created.


# 1.50 16-Nov-2002 oster

Cleanup more printfs.


# 1.49 15-Nov-2002 oster

After a rebuild-in-place, a reconstruct, or a copyback, we should
really be updating the component labels.


Revision tags: kqueue-aftermerge kqueue-beforemerge
# 1.48 18-Oct-2002 oster

Improve and/or re-arrange a number of locks. While much of the locking is
still a mess, and there are a number of unresolved issues here, this
gets us closer to being happier in LOCKDEBUG land.


# 1.47 06-Oct-2002 oster

Add a missing RF_LOCK_MUTEX().


# 1.46 06-Oct-2002 oster

Introduce a temp variable, and allocate the ReconCtrl structure before
we protect raidPtr. One less thing for LOCKDEBUG to complain about.


Revision tags: kqueue-base
# 1.45 23-Sep-2002 oster

Nuke "baddisk". Thanks to Simon B.


# 1.44 21-Sep-2002 oster

rf_RegisterReconDoneProc() isn't needed.

This is the last of the 'easy' ones that Krister made me aware of.
Total savings on i386 GENERIC kernel: 13151 bytes
RAIDframe in GENERIC is now at: 179033
Thanks again Krister!


# 1.43 19-Sep-2002 oster

Introduce and use RF_DEBUG_PSS, and save a few more bytes.


# 1.42 19-Sep-2002 oster

One signal will do, thanks.


# 1.41 17-Sep-2002 oster

Cast the RF_DEBUG_RECON net a little wider.


# 1.40 17-Sep-2002 oster

Rename RF_DEBUG_RECONBUFFER to RF_DEBUG_RECON in order to facilitate
disabling other stuff without having to introduce another #define.


# 1.39 16-Sep-2002 oster

Cleanup some comments.


# 1.38 16-Sep-2002 oster

rf_CheckFloatingRbufCount() is only really useful when debugging the
reconstruct buffer stuff. #if it out in the general case.


# 1.37 16-Sep-2002 oster

Cleanup some printf's, and disable some (debugging) output.


# 1.36 14-Sep-2002 oster

Everyone and their dog was using RF_ERRORMSG3 to print out the same
sort of error message, over and over again, in different files.
Rather than having the same text repeated in multiple .o files,
create a couple of little functions to do the printing, and save a
bundle of space. Also improves readability of code.


# 1.35 09-Sep-2002 oster

Disallow 'reconstruct-in-place' on a component that has failed
and has already been reconstructed to a hot spare.


Revision tags: gehenna-devsw-base
# 1.34 13-Jul-2002 oster

Nuke a redundant wakeup().


Revision tags: netbsd-1-6-PATCH002-RELEASE netbsd-1-6-PATCH002 netbsd-1-6-PATCH002-RC4 netbsd-1-6-PATCH002-RC3 netbsd-1-6-PATCH002-RC2 netbsd-1-6-PATCH002-RC1 netbsd-1-6-PATCH001 netbsd-1-6-PATCH001-RELEASE netbsd-1-6-PATCH001-RC3 netbsd-1-6-PATCH001-RC2 netbsd-1-6-PATCH001-RC1 netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base ifpoll-base
# 1.33 09-Jan-2002 oster

branches: 1.33.8;
Move a bunch of debugging stuff to be only used if DEBUG is turned on.


# 1.32 15-Nov-2001 lukem

don't need <sys/types.h> when including <sys/param.h>


# 1.31 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base thorpej-devvp-base3
# 1.30 04-Oct-2001 oster

Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster.h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.


Revision tags: thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.29 18-Jul-2001 thorpej

branches: 1.29.2;
bzero -> memset


# 1.28 14-Jun-2001 oster

branches: 1.28.2;
It's silly to need a parity rebuild after a reconstruction has completed.
If we've just reconstructed a disk, then the parity is known to
be correct. (XXX doesn't hold for RAID 6!)


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.27 26-Jan-2001 oster

branches: 1.27.2;
Ensure we update the 'partitionSize' field of the component labels
when doing a reconstruct or a copyback. If we don't, junk might be
there, and that could cause the component to be not correctly
autoconfigured on reboot. Thanks to Simon Burge for helping track this down.


Revision tags: netbsd-1-5-RELEASE netbsd-1-5-BETA2 netbsd-1-5-BETA netbsd-1-5-ALPHA2 netbsd-1-5-base
# 1.26 04-Jun-2000 oster

branches: 1.26.2;
Merge rf_update_component_labels() and rf_final_update_component_labels().


# 1.25 31-May-2000 oster

Oops.. reconstruction percentages were being reported incorrectly.
Thanks to Manuel Bouyer for noting this.


# 1.24 28-May-2000 oster

Umm.. Complete is not equal to 'left to do'. Fix the math.


# 1.23 28-May-2000 oster

- Add a mechanism for obtaining finer-grained 'progress' information
regarding reconstructs, copybacks, etc.

- RAID 0 doesn't do copybacks, but don't make raidctl sweat about it.


Revision tags: minoura-xpg4dl-base
# 1.22 13-Mar-2000 soren

branches: 1.22.2;
Fix doubled 'the's in comments.


# 1.21 07-Mar-2000 oster

Create a new rf_close_component() to handle vnode operations for closing
components. Teach rf_UnconfigureVnodes() how to use it, and tell
the copyback and reconstruction code about it too.


# 1.20 25-Feb-2000 oster

When we close autoconfigured components, we need to note that they
are no longer in 'autoconfigured' status.


# 1.19 25-Feb-2000 oster

Fix a (slightly) bogus status message.


# 1.18 24-Feb-2000 oster

Make sure we close auto-configured components appropriately when
attempting a rebuild-in-place.


# 1.17 23-Feb-2000 oster

Be more aggressive about updating component labels in the event
of a real component failure (or a simulated failure):
- add 'numNewFailures' to keep track of the number of disk failures
since mod_counter was last updated for each component label.
- make sure we call rf_update_component_labels() upon any component failure,
real or simulated.


# 1.16 23-Feb-2000 oster

Do a better job of (re)initializing the component labels after
a reconstruct or a copyback.


Revision tags: chs-ubc2-newbase
# 1.15 13-Feb-2000 oster

Get recent changes into the tree:
- make component_label variables more consistent (==> clabel)
- re-work incorrect component configuration code
- re-work disk configuration code
- cleanup initial configuration of raidPtr info
- add auto-detection of components and RAID sets (Disabled, for now)
- allow / on RAID sets (Disabled, for now)
- rename "config_disk_queue" to "rf_ConfigureDiskQueue" and properly prototype
in rf_diskqueue.h
- protect some headers with #if _KERNEL (XXX this needs to be fixed properly)
and cleanup header formatting.
- expand the component labels (yes, they should be backward/forward compatible)
- other bits and pieces (some function names are still bogus, and will get
changed soon)


# 1.14 09-Jan-2000 oster

Nuke dependencies on rf_cpuutils.h.


# 1.13 09-Jan-2000 oster

Nuke unused debugging stuff. Clean up a whole bunch of comments.


# 1.12 09-Jan-2000 oster

- move a bunch of function prototypes to rf_kintf.h
- general cleanup of a number of prototypes that were scattered around.


# 1.11 09-Jan-2000 oster

Nuke #if 0'ed code.


# 1.10 08-Jan-2000 oster

- nuke calls to rf_get_threadid() and associated #include
- change a bunch of debugging printfs from
"[%d] ...", tid (where tid is the "thread id")
to
"raid%d: ...", raidPtr->raidid
- other minor rototillage


# 1.9 05-Jan-2000 oster

- update RF_CREATE_THREAD to handle a 'process name' argument.
- fire up a new thread for parity re-writes, copybacks, and reconstructs.
The ioctl's which trigger these actions now return immediately.
- add progress accounting for the above actions.
- minor rototillage of rf_netbsdkintf.c to deal with all of the above.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base comdex-fall-1999-base fvdl-softdep-base
# 1.8 14-Aug-1999 oster

branches: 1.8.2;
Remove a 'struct proc *'-passing abomination that's been bugging me
for quite some time.


# 1.7 13-Aug-1999 oster

rf_sys.h does not need to be #included in any of these files, and, actually,
is no longer needed at all.


# 1.6 13-Aug-1999 oster

Clean up reconstruction accounting a bit. While it worked before, it was
slightly broken in the case where the RAID set did not support reconstruction.


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 chs-ubc2-base netbsd-1-4-RELEASE netbsd-1-4-base
# 1.5 02-Mar-1999 oster

branches: 1.5.2;
Update for recent changes including component label support, clean
bits, rebuilding components in-place, adding hot spares, shutdownhooks, etc.


# 1.4 05-Feb-1999 oster

Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.


# 1.3 26-Jan-1999 oster

Nuke more bits of RAIDframe "demo" code. We're not "demoing" here,
we're doing the Real Thing!


# 1.2 26-Jan-1999 oster

RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.


Revision tags: kenh-if-detach-base
# 1.1 13-Nov-1998 oster

RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.