History log of /netbsd-current/sys/dev/vnd.c
Revision Date Author Comments
# 1.289 19-May-2023 mlelstv

Do not limit the number of requests for the page daemon either.
Otherwise you may deadlock when the backend needs to allocate
memory and the page daemon needs to flush dirty vnd buffers.

See PR 57421 for details.


# 1.288 14-Mar-2023 hannken

Do not limit the number of pending requests for the worker thread.

With wedge on vnd it prevents a deadlock when requests get queued with
biodone() -> dkstart() -> vndstrategy().

Fixes PR kern/57263 "vnd locks up when using vn_rdwr"


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.287 04-Sep-2022 mlelstv

branches: 1.287.4;
revert 1.281
VNDIOCCLR requires write access to unconfigure a unit, even when the unit
is read-only.


# 1.286 31-May-2022 riastradh

vnd(4): Work around deadlock in VNDIOCCLR.

Since the changes this year to eliminate a host of races and
deadlocks in open, close, revoke, attach, and detach, closing the
last instance of a device special node has the side effect of waiting
for all concurrent I/O operations (read, write, ioctl, strategy, &c.)
on the device to complete.

Unfortunately, while this works for physical devices which revoke
open device nodes in their autoconf detach functions, as invoked by
some hardware interrupt indicating that the device is no longer
present, pseudo-devices like vnd(4) work differently -- or, work by
luck, or don't work any more.

VNDIOCCLR acts kind of like an autoconf detach function in that it
revokes open device nodes, which closes the last instance. But
VNDIOCCLR is itself called via ioctl, which is an I/O operation that
close waits for. So we end up with a deadlock, spec_io_drain waiting
for spec_close lower down in the call stack:

> spec_io_drain() at netbsd:spec_io_drain+0x84
> spec_close() at netbsd:spec_close+0x1c6
> VOP_CLOSE() at netbsd:VOP_CLOSE+0x38
> spec_node_revoke() at netbsd:spec_node_revoke+0x14d
> vcache_reclaim() at netbsd:vcache_reclaim+0x4e7
> vgone() at netbsd:vgone+0xcd
> vrevoke() at netbsd:vrevoke+0xfa
> genfs_revoke() at netbsd:genfs_revoke+0x13
> VOP_REVOKE() at netbsd:VOP_REVOKE+0x35
> vdevgone() at netbsd:vdevgone+0x64
> vnddoclear.part.0() at netbsd:vnddoclear.part.0+0xaa
> vndioctl() at netbsd:vndioctl+0x78c
> bdev_ioctl() at netbsd:bdev_ioctl+0x91
> spec_ioctl() at netbsd:spec_ioctl+0xa5
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x41
> vn_ioctl() at netbsd:vn_ioctl+0xb3
> sys_ioctl() at netbsd:sys_ioctl+0x555

In the past, there was a workaround for what was presumably a crash
instead of a deadlock here: don't issue revoke (vdevgone) on the open
character devices for the minor number in use by the ioctl. If you
use, e.g., `vnconfig -u vnd0', and vnconfig(8) picks /dev/rvnd0c or
/dev/rvnd0d, that special case kicks in. But if you use `vnconfig -u
/dev/vnd0d', the ioctl will be issued on the block device instead, so
the special case doesn't kick in, so the operation deadlocks.

It is actually probably safe not to revoke the block device if what
the ioctl caller holds open is that, because specfs(9) forbids more
than one open of a block device, so nothing else can have it open
anyway.

Unclear what the consequences of failing to revoke the character
device are -- but this is what vnd(4) has done all along. cgd(4) and
ccd(4) also don't bother to revoke. We don't have a notion of
`revoke every file descriptor _except_ this one'; only a vnode as a
whole can be revoked, including all references to it.

This is a stop-gap measure to avoid a deadlock we are definitely
hitting on some users. A slightly better measure would be to revoke
the block or character device according to which one is being used,
but that requires a little more work with two different d_ioctl
functions -- and wouldn't address issues with the character device. A
proper solution requires identifying the appropriate protocol for all
of these pseudo-device disk drivers and using it uniformly for them.

Reported on current-users:
https://mail-index.netbsd.org/current-users/2022/05/27/msg042437.html


# 1.285 31-Mar-2022 pgoyette

For device modules that provide both auto-config and /dev/xxx
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:

devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()

While here, add a few missing calls to some of the detach
routines.

Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.

No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!

XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).
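
The call sequence listed in this entry can be sketched as a small,
self-contained model. This is an illustration only, not code from the
tree: the stub_* functions are hypothetical stand-ins that just record
the order in which they run, and the real devsw_attach() and
config_*() interfaces take arguments that are not modeled here.

```c
#include <stddef.h>

/* Steps of the sequence, in the order the log entry requires them. */
enum step {
	STEP_DEVSW_ATTACH,
	STEP_CONFIG_INIT,
	STEP_CONFIG_FINI,
	STEP_DEVSW_DETACH
};

#define MAX_STEPS 8
static enum step trace[MAX_STEPS];	/* order in which stubs ran */
static size_t ntrace;

/* Record a step and hand back the error code the caller asked for. */
static int
record(enum step s, int error)
{
	if (ntrace < MAX_STEPS)
		trace[ntrace++] = s;
	return error;
}

/* Hypothetical stand-ins for the real kernel interfaces. */
static int stub_devsw_attach(void) { return record(STEP_DEVSW_ATTACH, 0); }
static int stub_config_init(int error) { return record(STEP_CONFIG_INIT, error); }
static int stub_config_fini(void) { return record(STEP_CONFIG_FINI, 0); }
static int stub_devsw_detach(void) { return record(STEP_DEVSW_DETACH, 0); }

/* Initialization: devsw first, then autoconf; unwind on failure. */
static int
module_init_sketch(int config_error)
{
	int error;

	error = stub_devsw_attach();
	if (error)
		return error;
	error = stub_config_init(config_error);
	if (error) {
		/* Back out the devsw attachment made above. */
		(void)stub_devsw_detach();
		return error;
	}
	return 0;
}

/* Destruction: the mirror image, autoconf first, then devsw. */
static int
module_fini_sketch(void)
{
	int error;

	error = stub_config_fini();
	if (error)
		return error;
	return stub_devsw_detach();
}
```

The point of the model is only the ordering: teardown runs in the
reverse order of setup, and a failed setup step undoes the steps that
already succeeded.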


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in a few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.4; 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
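
As a rough illustration of deriving a cylinder count from an image
size, the sketch below divides the size by the bytes in one cylinder.
It is not the driver's code: the function name and parameters are
hypothetical, and rounding down is an assumption made for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of whole cylinders that fit in an image.
 * One cylinder holds secsize * nsectors * ntracks bytes.
 */
static uint64_t
cylinders_from_size(uint64_t size_bytes, uint32_t secsize,
    uint32_t nsectors, uint32_t ntracks)
{
	/* Widen before multiplying so the product cannot wrap at 32 bits. */
	uint64_t cyl_bytes = (uint64_t)secsize * nsectors * ntracks;

	if (cyl_bytes == 0)
		return 0;	/* incomplete geometry, nothing to compute */
	return size_bytes / cyl_bytes;
}
```

For a 1 GiB image with 512-byte sectors, 32 sectors per track and 64
tracks per cylinder, one cylinder is 1 MiB, so the helper yields 1024.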


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
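
The truncation this entry describes can be reproduced in a few lines.
This is a sketch, not libkern's source: uimin_sketch and MIN_SKETCH
are hypothetical names, and the example assumes unsigned int is 32
bits wide.

```c
#include <stdint.h>

/* Same shape as a min() defined on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: compares in the operands' own type, so no truncation. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* What a caller gets when 64-bit values pass through the function. */
static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	/* The casts make explicit what the implicit conversion would do. */
	return uimin_sketch((unsigned int)a, (unsigned int)b);
}

/* The same comparison done with the macro keeps all 64 bits. */
static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With a = 0x100000001 and b = 5, the function sees a as 1 and returns
1, while the macro returns the correct minimum, 5. The macro has its
own hazard, since it evaluates one of its arguments twice.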


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
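
The difference between the two behaviours is easy to show in
isolation. The helper names below are hypothetical and the code is a
sketch, not the driver's.

```c
#include <stdint.h>

/* Old behaviour: keep only the low 32 bits, so large counts wrap. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour: saturate at UINT32_MAX when the count does not fit. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a count of 0x100000010 sectors, the low 32 bits are just 0x10,
which makes a huge disk look tiny; clamping reports UINT32_MAX
instead, which is still wrong but at least large.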


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
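
The general pattern behind this kind of fix, shown here with standard
snprintf() rather than the kernel's own printing functions, is to pass
the name as an argument to a fixed "%s" format. The helper name is
hypothetical.

```c
#include <stdio.h>

/*
 * Copy a device name into buf without interpreting it as a format.
 * Any '%' in the name is treated as ordinary text.
 */
static int
format_name_safely(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}
```

Passing the name itself as the format would let a name such as
"vnd%d" consume arguments that were never supplied.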


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
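
A minimal sketch of the convention described above, using a toy stand-in
for struct buf (the struct, the function name and the field layout are
hypothetical, not the kernel's definitions): the driver records the
failure in b_error and leaves b_flags alone.

```c
#include <errno.h>

/* Toy stand-in for the kernel's struct buf; only two fields are modelled. */
struct toy_buf {
	int b_flags;	/* locking rules apply; the driver leaves it alone */
	int b_error;	/* owned by whoever completes the I/O request */
};

/* Record a failed transfer without touching b_flags. */
static void
toy_fail_io(struct toy_buf *bp, int error)
{
	bp->b_error = error;
}
```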


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same field.
The work the routine's comment describes as its primary function,
end-of-partition adjustment, is wiped out: b_cylinder is set at the end
of the routine, which overwrites the adjustment made to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, and then the result is compared
to the original byte count, i.e. apples to oranges: blocks to bytes, not
blocks to blocks.
Break three: if the results of this routine mattered, all the other
drivers that use it would have broken as vnd compress did, so I must
assume they always ignored them. So if end-of-partition adjustment is
really required, then all these other drivers have been broken for a
long time.
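
The unit mismatch in "break two" can be illustrated with a small sketch
that keeps every comparison in sectors and converts back to bytes only
for the result. The function name and signature are hypothetical; this
is not the routine the commit touched.

```c
#include <stdint.h>

/*
 * Clamp a transfer to the end of a partition, comparing sectors with
 * sectors throughout. Returns the number of bytes that may be moved.
 */
static int64_t
toy_clamp_to_partition(int64_t blkno, int64_t bcount,
    int64_t part_sectors, int64_t secsize)
{
	int64_t want, avail;

	if (secsize <= 0 || bcount < 0 || blkno < 0 || blkno >= part_sectors)
		return 0;
	want = (bcount + secsize - 1) / secsize;	/* bytes -> sectors */
	avail = part_sectors - blkno;			/* sectors left */
	if (want <= avail)
		return bcount;
	return avail * secsize;				/* sectors -> bytes */
}
```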


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fall back to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.
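
The selection logic can be sketched roughly as follows. The capability
struct and all names are hypothetical; the driver itself queries the
underlying vnode rather than a flag table.

```c
#include <stdbool.h>

/* Hypothetical capability flags for the file system backing the image. */
struct toy_fs_caps {
	bool has_bmap;		/* VOP_BMAP yields usable translations */
	bool has_strategy;	/* VOP_STRATEGY is implemented */
};

enum toy_io_path { TOY_IO_BMAP_STRATEGY, TOY_IO_READ_WRITE };

/* Use the block-mapping path only when both operations are available. */
static enum toy_io_path
toy_pick_io_path(const struct toy_fs_caps *caps)
{
	if (caps->has_bmap && caps->has_strategy)
		return TOY_IO_BMAP_STRATEGY;
	return TOY_IO_READ_WRITE;
}
```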


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
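
A sketch of what such validation might look like. The struct, the
function name and the exact set of checks are assumptions for
illustration; the commit message only states that zero-producing
geometries must be rejected.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-supplied geometry. */
struct toy_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;
};

/* Reject geometries that would later divide by zero or exceed the image. */
static int
toy_check_geom(const struct toy_geom *g, uint64_t image_bytes)
{
	uint64_t image_sectors, spc;

	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	image_sectors = image_bytes / g->secsize;
	spc = (uint64_t)g->nsectors * g->ntracks;  /* sectors per cylinder */
	/* Divide instead of multiplying so nothing can overflow. */
	if (spc > image_sectors || g->ncylinders > image_sectors / spc)
		return EINVAL;
	return 0;
}
```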


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.
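
The accounting change can be sketched with a toy statistics struct.
All names here are hypothetical and only model the idea of
per-direction counters with the old combined values derived by
addition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-direction counters. */
struct toy_disk_stats {
	uint64_t rbytes, wbytes;	/* bytes moved per direction */
	uint64_t rxfer, wxfer;		/* transfers per direction */
};

/* Account a finished transfer to the direction the caller reports. */
static void
toy_disk_unbusy(struct toy_disk_stats *st, uint64_t bcount, bool is_read)
{
	if (is_read) {
		st->rbytes += bcount;
		st->rxfer++;
	} else {
		st->wbytes += bcount;
		st->wxfer++;
	}
}

/* The old combined figures are the sums of both directions. */
static uint64_t
toy_total_bytes(const struct toy_disk_stats *st)
{
	return st->rbytes + st->wbytes;
}

static uint64_t
toy_total_xfer(const struct toy_disk_stats *st)
{
	return st->rxfer + st->wxfer;
}
```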


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- At compile time, the list of device major numbers is packed into the kernel
and the LKM framework will refer to it to assign device major numbers dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
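
To illustrate the lossage, here is a sketch of what rounding the size
down to whole 1MB cylinders discards. The function names are
hypothetical, and the figure of 2048 sectors per cylinder used in the
note below assumes 512-byte sectors.

```c
#include <stdint.h>

/* Size in sectors after rounding down to whole cylinders (the lossy way). */
static uint64_t
toy_truncated_sectors(uint64_t sectors, uint64_t sectors_per_cyl)
{
	if (sectors_per_cyl == 0)
		return sectors;
	return (sectors / sectors_per_cyl) * sectors_per_cyl;
}

/* Sectors at the tail of the image that become unreachable. */
static uint64_t
toy_lost_sectors(uint64_t sectors, uint64_t sectors_per_cyl)
{
	return sectors - toy_truncated_sectors(sectors, sectors_per_cyl);
}
```

With 2048 sectors per cylinder, a 1.5MB image of 3072 sectors would lose
its last 1024 sectors under truncation.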


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.
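
A sketch of the check described above. The limit of 8 comes from the
message itself; the function name and the choice of ENOTTY as the
error are assumptions made for illustration.

```c
#include <errno.h>

#define TOY_OLDMAXPARTITIONS 8	/* limit of the old disklabel layout */

/*
 * Decide whether a label can be handed to an old-style ioctl caller.
 * Returns 0 if it fits, or an error instead of silently truncating.
 */
static int
toy_old_label_check(unsigned int npartitions)
{
	if (npartitions > TOY_OLDMAXPARTITIONS)
		return ENOTTY;	/* assumed error code */
	return 0;
}
```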


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.
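
A sketch of the masking, with made-up flag values (the TOY_B_* constants
do not match the kernel's definitions): only the direction and
asynchrony bits carry over to a component buffer.

```c
/* Hypothetical flag bits; the values are for illustration only. */
#define TOY_B_READ	0x01
#define TOY_B_ASYNC	0x02
#define TOY_B_DONE	0x04
#define TOY_B_BUSY	0x08

/* Flags a component buffer inherits from the original request. */
static int
toy_component_flags(int parent_flags)
{
	return parent_flags & (TOY_B_READ | TOY_B_ASYNC);
}
```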


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface currently doesn't allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix typo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.
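
One way a default with 1M cylinders could be fabricated, as a sketch. The
split into 32 sectors per track and 64 tracks per cylinder at 512 bytes
per sector is an assumption chosen so that one cylinder is exactly 1M;
the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical geometry description. */
struct toy_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint64_t ncylinders;
};

/* Fabricate a geometry in which every cylinder holds 1M. */
static struct toy_geom
toy_default_geom(uint64_t image_sectors)
{
	struct toy_geom g;

	g.secsize = 512;
	g.nsectors = 32;
	g.ntracks = 64;		/* 512 * 32 * 64 = 1048576 bytes */
	g.ncylinders = image_sectors / (32 * 64);
	return g;
}
```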


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* are the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.288 14-Mar-2023 hannken

Do not limit the number of pending requests for the worker thread.

With wedge on vnd it prevents a deadlock when requests get queued with
biodone() -> dkstart() -> vndstrategy().

Fixes PR kern/57263 "vnd locks up when using vn_rdwr"


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.287 04-Sep-2022 mlelstv

branches: 1.287.4;
revert 1.281
VNDIOCCLR requires write access to unconfigure a unit, even when the unit
is read-only.


# 1.286 31-May-2022 riastradh

vnd(4): Work around deadlock in VNDIOCCLR.

Since the changes this year to eliminate a host of races and
deadlocks in open, close, revoke, attach, and detach, closing the
last instance of a device special node has the side effect of waiting
for all concurrent I/O operations (read, write, ioctl, strategy, &c.)
on the device to complete.

Unfortunately, while this works for physical devices which revoke
open device nodes in their autoconf detach functions, as invoked by
some hardware interrupt indicating that the device is no longer
present, pseudo-devices like vnd(4) work differently -- or, work by
luck, or don't work any more.

VNDIOCCLR acts kind of like an autoconf detach function in that it
revokes open device nodes, which closes the last instance. But
VNDIOCCLR is itself called via ioctl, which is an I/O operation that
close waits for. So we end up with a deadlock, spec_io_drain waiting
for spec_close lower down in the call stack:

> spec_io_drain() at netbsd:spec_io_drain+0x84
> spec_close() at netbsd:spec_close+0x1c6
> VOP_CLOSE() at netbsd:VOP_CLOSE+0x38
> spec_node_revoke() at netbsd:spec_node_revoke+0x14d
> vcache_reclaim() at netbsd:vcache_reclaim+0x4e7
> vgone() at netbsd:vgone+0xcd
> vrevoke() at netbsd:vrevoke+0xfa
> genfs_revoke() at netbsd:genfs_revoke+0x13
> VOP_REVOKE() at netbsd:VOP_REVOKE+0x35
> vdevgone() at netbsd:vdevgone+0x64
> vnddoclear.part.0() at netbsd:vnddoclear.part.0+0xaa
> vndioctl() at netbsd:vndioctl+0x78c
> bdev_ioctl() at netbsd:bdev_ioctl+0x91
> spec_ioctl() at netbsd:spec_ioctl+0xa5
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x41
> vn_ioctl() at netbsd:vn_ioctl+0xb3
> sys_ioctl() at netbsd:sys_ioctl+0x555

In the past, there was a workaround for what was presumably a crash
instead of a deadlock here: don't issue revoke (vdevgone) on the open
character devices for the minor number in use by the ioctl. If you
use, e.g., `vnconfig -u vnd0', and vnconfig(8) picks /dev/rvnd0c or
/dev/rvnd0d, that special case kicks in. But if you use `vnconfig -u
/dev/vnd0d', the ioctl will be issued on the block device instead, so
the special case doesn't kick in, so the operation deadlocks.

It is actually probably safe not to revoke the block device if what
the ioctl caller holds open is that, because specfs(9) forbids more
than one open of a block device, so nothing else can have it open
anyway.

Unclear what the consequences of failing to revoke the character
device are -- but this is what vnd(4) has done all along. cgd(4) and
ccd(4) also don't bother to revoke. We don't have a notion of
`revoke every file descriptor _except_ this one'; only a vnode as a
whole can be revoked, including all references to it.

This is a stop-gap measure to avoid a deadlock we are definitely
hitting on some users. A slightly better measure would be to revoke
the block or character device according to which one is being used,
but that requires a little more work with two different d_ioctl
functions -- and wouldn't address issues with the character device. A
proper solution requires identifying the appropriate protocol for all
of these pseudo-device disk drivers and using it uniformly for them.

Reported on current-users:
https://mail-index.netbsd.org/current-users/2022/05/27/msg042437.html


# 1.285 31-Mar-2022 pgoyette

For device modules that provide both auto-config and /dev/xxx
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:

devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()

While here, add a few missing calls to some of the detach
routines.

Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.

No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!

XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in a few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.4; 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
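
To make the truncation concrete, here is a minimal userland sketch. It
assumes a platform where unsigned int is 32 bits wide. The helper names
clamp_len_truncating and clamp_len_exact are hypothetical and do not
appear in the tree; uimin is re-declared locally with the unsigned int
signature described above.

```c
#include <stdint.h>

/* Local stand-in with the signature described above: unsigned int only. */
static inline unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro in the MIN/MAX style; evaluates arguments twice. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper: both 64-bit values are narrowed before comparing. */
static uint64_t
clamp_len_truncating(uint64_t len, uint64_t limit)
{
	return uimin(len, limit);
}

/* Hypothetical helper: the comparison happens in 64 bits, nothing is lost. */
static uint64_t
clamp_len_exact(uint64_t len, uint64_t limit)
{
	return MIN(len, limit);
}
```

With a length of 2^32 + 5 and a limit of UINT64_MAX, the first helper
returns 5 because the upper 32 bits are discarded on the way into uimin,
while the second returns the original length.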


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
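
The difference between the two behaviours can be sketched as follows.
Both function names are hypothetical and are not taken from vnd.c; the
sketch only contrasts keeping the low 32 bits with saturating at
UINT32_MAX.

```c
#include <stdint.h>

/* Old behaviour as described: keep only the low 32 bits of the count. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour as described: saturate at UINT32_MAX instead of wrapping. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a count of 2^32 + 100 sectors the first variant reports 100, which
looks like a tiny disk, whereas the clamped variant reports the largest
value the 32-bit field can hold.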


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
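
As an illustration of this class of bug (not the actual change), the
sketch below uses uint32_t as a stand-in for a 32-bit size_t. The
function names are hypothetical. It shows a product computed in 32 bits
wrapping around, and the same product computed correctly once one
operand is widened first.

```c
#include <stdint.h>

/* Hypothetical helper: the product is stored in 32 bits and wraps. */
static uint64_t
image_bytes_narrow(uint32_t nblocks, uint32_t blocksize)
{
	uint32_t bytes = nblocks * blocksize;	/* wraps modulo 2^32 */

	return bytes;
}

/* Hypothetical helper: widen one operand so the multiply is 64-bit. */
static uint64_t
image_bytes_wide(uint32_t nblocks, uint32_t blocksize)
{
	return (uint64_t)nblocks * blocksize;
}
```

With 2^23 blocks of 512 bytes the narrow version yields 0, while the
widened version yields 2^32 bytes as expected.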


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
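
The padding problem described above can be sketched with two
hypothetical structures (these are not the real vnd ioctl structures).
BSD ioctl numbers encode the size of the argument, so deriving the old
size from offsetof() of the new trailing member gives the wrong number
whenever the compiler inserts padding in front of that member.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: ends with a 32-bit size field. */
struct old_req {
	uint32_t osize;
};

/* Hypothetical new layout: a 64-bit size field appended at the end. */
struct new_req {
	uint32_t osize;
	uint64_t size;
};

/* "Old size" derived from the new structure; includes any padding. */
static size_t
size_via_offsetof(void)
{
	return offsetof(struct new_req, size);
}

/* Real size of the old structure. */
static size_t
size_via_sizeof(void)
{
	return sizeof(struct old_req);
}

/* Bytes of padding that offsetof() silently counts as part of the old size. */
static size_t
hidden_padding(void)
{
	return size_via_offsetof() - size_via_sizeof();
}
```

Where 64-bit integers are 8-byte aligned, hidden_padding() is 4 for
these structures; where they are only 4-byte aligned it is 0, which is
why the mismatch shows up on some platforms and not others.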


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work that the routine's comments describe as its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
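
The unit mismatch described as "Break two" can be illustrated with a small
sketch. This is not the kernel routine; the function names, the fixed
512-byte sector size and the simplified signature are assumptions made only
to show blocks being compared with bytes.

```c
#include <stdint.h>

#define SECTOR_SIZE 512 /* assumed sector size, for illustration only */

/*
 * Consistent version: the space left in the partition is converted to
 * bytes before it is compared with the byte count of the request.
 */
static int64_t
clamp_bytes(int64_t start_blk, int64_t bcount, int64_t part_blks)
{
	int64_t avail_blks = part_blks - start_blk;

	if (avail_blks <= 0)
		return 0;
	int64_t avail_bytes = avail_blks * SECTOR_SIZE;
	return bcount < avail_bytes ? bcount : avail_bytes;
}

/*
 * Mixed-unit version: avail_blks is in blocks but bcount is in bytes,
 * so the comparison is "apples to oranges" and truncates valid requests.
 */
static int64_t
clamp_bytes_mixed(int64_t start_blk, int64_t bcount, int64_t part_blks)
{
	int64_t avail_blks = part_blks - start_blk;

	if (avail_blks <= 0)
		return 0;
	return bcount < avail_blks ? bcount : avail_blks;
}
```

With a 100-block partition, a 4096-byte request at block 0 fits completely,
but the mixed-unit version clamps it as if only 100 bytes were available.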


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fall back to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.
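
A minimal sketch of this kind of header check, assuming a simplified header
layout. The struct and function names are hypothetical and do not reflect
the real compressed-image header used by the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified image header; not the real on-disk format. */
struct comp_header {
	uint32_t block_size; /* uncompressed bytes per block */
	uint32_t num_blocks; /* number of compressed blocks */
};

/*
 * Reject a header whose block size is zero before any later code
 * divides by it or uses it to size a buffer.
 */
static int
comp_header_check(const struct comp_header *h)
{
	if (h->block_size == 0)
		return EINVAL;
	return 0;
}
```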


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store an i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
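
The check described above can be sketched as follows. The struct and
function names are hypothetical, and the real driver may validate more
fields. The point shown is that every geometry field later used as a
divisor has to be non-zero.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-supplied geometry; not the driver's real structure. */
struct geom {
	uint32_t secsize;    /* bytes per sector */
	uint32_t nsectors;   /* sectors per track */
	uint32_t ntracks;    /* tracks per cylinder */
	uint32_t ncylinders; /* cylinders per unit */
};

/* Refuse any geometry with a zero field. */
static int
geom_check(const struct geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	return 0;
}

/* Map a block number to its cylinder; divides by sectors per cylinder. */
static int
geom_cylinder_of(const struct geom *g, uint64_t blkno, uint64_t *cyl)
{
	int error = geom_check(g);

	if (error)
		return error;
	*cyl = blkno / ((uint64_t)g->nsectors * g->ntracks);
	return 0;
}
```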


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
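
A small arithmetic sketch of the lossage, assuming 512-byte sectors and
rounding down to whole 1 MB units. The helper names are hypothetical and
only illustrate the effect described above.

```c
#include <stdint.h>

#define SECTORS_PER_MB 2048u /* 1 MB / 512-byte sectors (assumed) */

/* Size after rounding down to a whole number of 1 MB units. */
static uint64_t
size_truncated(uint64_t nsectors)
{
	return (nsectors / SECTORS_PER_MB) * SECTORS_PER_MB;
}

/* Sectors at the end of the image that become unreachable. */
static uint64_t
sectors_lost(uint64_t nsectors)
{
	return nsectors - size_truncated(nsectors);
}
```

For a 2880-sector (1.44 MB) floppy image, rounding down leaves 2048 sectors
and hides the last 832; an image that is an exact multiple of 1 MB loses
nothing.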


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.
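
A sketch of why the wider type matters once a block number is scaled to a
byte offset. The helper names are hypothetical, and the narrow variant uses
an unsigned 32-bit type so that the wraparound is well defined in C rather
than relying on signed overflow.

```c
#include <stdint.h>

/* 64-bit arithmetic: the byte offset is representable. */
static int64_t
byte_offset_wide(int64_t blkno, int32_t blksize)
{
	return blkno * blksize;
}

/* 32-bit arithmetic: the product wraps modulo 2^32. */
static uint32_t
byte_offset_narrow(uint32_t blkno, uint32_t blksize)
{
	return blkno * blksize;
}
```

Block 8388608 with 512-byte blocks lies exactly 4 GB into the file; the
32-bit product wraps to 0.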


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't allow currently to specify recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.287 04-Sep-2022 mlelstv

revert 1.281
VNDIOCCLR requires write access to unconfigure a unit, even when the unit
is read-only.


# 1.286 31-May-2022 riastradh

vnd(4): Work around deadlock in VNDIOCCLR.

Since the changes this year to eliminate a host of races and
deadlocks in open, close, revoke, attach, and detach, closing the
last instance of a device special node has the side effect of waiting
for all concurrent I/O operations (read, write, ioctl, strategy, &c.)
on the device to complete.

Unfortunately, while this works for physical devices which revoke
open device nodes in their autoconf detach functions, as invoked by
some hardware interrupt indicating that the device is no longer
present, pseudo-devices like vnd(4) work differently -- or, work by
luck, or don't work any more.

VNDIOCCLR acts kind of like an autoconf detach function in that it
revokes open device nodes, which closes the last instance. But
VNDIOCCLR is itself called via ioctl, which is an I/O operation that
close waits for. So we end up with a deadlock, spec_io_drain waiting
for spec_close lower down in the call stack:

> spec_io_drain() at netbsd:spec_io_drain+0x84
> spec_close() at netbsd:spec_close+0x1c6
> VOP_CLOSE() at netbsd:VOP_CLOSE+0x38
> spec_node_revoke() at netbsd:spec_node_revoke+0x14d
> vcache_reclaim() at netbsd:vcache_reclaim+0x4e7
> vgone() at netbsd:vgone+0xcd
> vrevoke() at netbsd:vrevoke+0xfa
> genfs_revoke() at netbsd:genfs_revoke+0x13
> VOP_REVOKE() at netbsd:VOP_REVOKE+0x35
> vdevgone() at netbsd:vdevgone+0x64
> vnddoclear.part.0() at netbsd:vnddoclear.part.0+0xaa
> vndioctl() at netbsd:vndioctl+0x78c
> bdev_ioctl() at netbsd:bdev_ioctl+0x91
> spec_ioctl() at netbsd:spec_ioctl+0xa5
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x41
> vn_ioctl() at netbsd:vn_ioctl+0xb3
> sys_ioctl() at netbsd:sys_ioctl+0x555

In the past, there was a workaround for what was presumably a crash
instead of a deadlock here: don't issue revoke (vdevgone) on the open
character devices for the minor number in use by the ioctl. If you
use, e.g., `vnconfig -u vnd0', and vnconfig(8) picks /dev/rvnd0c or
/dev/rvnd0d, that special case kicks in. But if you use `vnconfig -u
/dev/vnd0d', the ioctl will be issued on the block device instead, so
the special case doesn't kick in, so the operation deadlocks.

It is actually probably safe not to revoke the block device if what
the ioctl caller holds open is that, because specfs(9) forbids more
than one open of a block device, so nothing else can have it open
anyway.

Unclear what the consequences of failing to revoke the character
device are -- but this is what vnd(4) has done all along. cgd(4) and
ccd(4) also don't bother to revoke. We don't have a notion of
`revoke every file descriptor _except_ this one'; only a vnode as a
whole can be revoked, including all references to it.

This is a stop-gap measure to avoid a deadlock we are definitely
hitting on some users. A slightly better measure would be to revoke
the block or character device according to which one is being used,
but that requires a little more work with two different d_ioctl
functions -- and wouldn't address issues with the character device. A
proper solution requires identifying the appropriate protocol for all
of these pseudo-device disk drivers and using it uniformly for them.

Reported on current-users:
https://mail-index.netbsd.org/current-users/2022/05/27/msg042437.html


# 1.285 31-Mar-2022 pgoyette

For device modules that provide both auto-config and /dev/xxx
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:

devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()

While here, add a few missing calls to some of the detach
routines.

Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.

No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!

XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-3-RELEASE netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
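
A minimal userland sketch of deriving a cylinder count from an image size.
The structure `geom_sketch`, its field names, and `fill_cylinders` are
hypothetical and are not the driver's own code; treating a zero count as
"missing" and rounding down to whole cylinders are both assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* 0 is taken to mean "not specified" */
};

/* Fill in a missing cylinder count from the image size in bytes. */
static void
fill_cylinders(struct geom_sketch *g, uint64_t image_bytes)
{
	uint64_t bytes_per_cyl;

	/* A zero in any of these would make the division meaningless. */
	assert(g->secsize != 0 && g->nsectors != 0 && g->ntracks != 0);

	if (g->ncylinders != 0)
		return;		/* caller supplied a count; keep it */

	bytes_per_cyl = (uint64_t)g->secsize * g->nsectors * g->ntracks;
	g->ncylinders = (uint32_t)(image_bytes / bytes_per_cyl);
}
```

With 512-byte sectors, 32 sectors per track and 64 tracks per cylinder, one
cylinder holds 1 MiB, so a 1 GiB image works out to 1024 cylinders.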


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
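
The hazard described above can be sketched in userland C. The names
`uimin_sketch`, `MIN_SKETCH`, `clamp_with_function` and `clamp_with_macro`
are hypothetical stand-ins, not the libkern definitions, and the sketch
assumes a 32-bit `unsigned int`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: unsigned int is 32 bits wide, as on common ILP32/LP64 ABIs. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Stand-in for a min() that is defined on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: may evaluate its arguments twice, but keeps their type. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* 64-bit values are silently narrowed to 32 bits at the call. */
static uint64_t
clamp_with_function(uint64_t len, uint64_t limit)
{
	return uimin_sketch(len, limit);
}

/* The comparison is done in 64 bits; nothing is lost. */
static uint64_t
clamp_with_macro(uint64_t len, uint64_t limit)
{
	return MIN_SKETCH(len, limit);
}
```

Calling both with `len = 0x100000005` and `limit = 0x200000000` shows the
difference: the macro returns `0x100000005`, while the function compares the
narrowed values 5 and 0 and returns 0.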


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
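
The difference between the two behaviours can be sketched as follows. The
function names are hypothetical and this is not the driver's code, only an
illustration of truncating versus saturating a 64-bit sector count.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour as described: keep only the low 32 bits of the count. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour as described: saturate at UINT32_MAX. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* Usage: 2^32 + 16 sectors, i.e. 2 TiB + 8 KiB with 512-byte sectors. */
static void
sectors_demo(void)
{
	uint64_t nsectors = (UINT64_C(1) << 32) + 16;

	assert(sectors_low32(nsectors) == 16);	/* looks like a tiny disk */
	assert(sectors_clamped(nsectors) == UINT32_MAX);
}
```

Saturating keeps the reported size as close to the truth as a 32-bit field
allows, whereas the low 32 bits alone can describe a disk of almost any
smaller size.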


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
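
A userland sketch of the padding problem described above. The structures
`ioc_old` and `ioc_new` are hypothetical, not the real vnd ioctl
structures, and the explicit `_Alignas(8)` is an assumption used to mimic an
ABI that aligns 64-bit integers to 8 bytes regardless of the build host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit fields, 12 bytes, no padding. */
struct ioc_old {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
};
static_assert(sizeof(struct ioc_old) == 12, "sketch assumes a packed prefix");

/*
 * Hypothetical new layout: the same prefix plus a 64-bit size.  The
 * forced alignment inserts 4 bytes of padding after osize.
 */
struct ioc_new {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
	_Alignas(8) uint64_t size;
};

/* How far offsetof() overshoots the true size of the old structure. */
static size_t
old_size_error(void)
{
	return offsetof(struct ioc_new, size) - sizeof(struct ioc_old);
}
```

Here `offsetof(struct ioc_new, size)` is 16 while `sizeof(struct ioc_old)`
is 12. Because BSD ioctl numbers encode the parameter length, a compat
ioctl defined with the former does not match the number an old binary
issues.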


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
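
A userland analogue of the `va_bytes < va_size` test, as a sketch only. The
function names are hypothetical, and the sketch assumes `st_blocks` counts
512-byte units, which is conventional but not guaranteed everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* A file is definitely sparse if it occupies less space than its length. */
static int
definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
	return allocated_bytes < logical_size;
}

/*
 * Returns 1 if the file at path is definitely sparse, 0 if it is not
 * known to be, and -1 if stat() fails.
 * Assumption: st_blocks is expressed in 512-byte units.
 */
static int
path_is_definitely_sparse(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) == -1)
		return -1;
	return definitely_sparse((uint64_t)st.st_blocks * 512,
	    (uint64_t)st.st_size);
}
```

The test is one-sided: a file whose allocated size is at least its length
can still contain holes, for example when indirect blocks or preallocation
inflate the block count.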


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same field.
The work that the routine's comment describes as its primary function,
end of partition adjustment, is wiped out: at the end of
the routine b_cylinder is set, which overwrites the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since the other drivers that use this routine did not
break as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
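
The unit mismatch described in "Break two" can be illustrated with a
minimal sketch. This is not the routine from the tree: the name
clip_to_partition, the 512-byte block size and the return convention are
all assumptions made for illustration. The point is only that the byte
count is converted to blocks once, the comparison is blocks to blocks,
and the result is converted back to bytes at the end.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BSHIFT 9			/* assumed 512-byte blocks */
#define SKETCH_BSIZE  (1 << SKETCH_BSHIFT)

/*
 * Hypothetical helper: return how many bytes of a bcount-byte request
 * starting at block blkno fit inside a partition of part_blocks blocks,
 * or -1 if the request starts outside the partition.
 */
static int64_t
clip_to_partition(int64_t blkno, int64_t bcount, int64_t part_blocks)
{
	int64_t nblks;

	assert(bcount >= 0 && part_blocks >= 0);
	if (blkno < 0 || blkno > part_blocks)
		return -1;

	/* Convert the byte count to blocks once, rounding up. */
	nblks = (bcount + SKETCH_BSIZE - 1) >> SKETCH_BSHIFT;

	/* Compare blocks with blocks. */
	if (nblks <= part_blocks - blkno)
		return bcount;

	/* Clip in blocks, then convert back to bytes for the caller. */
	return (part_blocks - blkno) << SKETCH_BSHIFT;
}
```

With a 100-block partition, a 2048-byte request at block 98 is clipped
to 1024 bytes, and a request starting exactly at block 100 is clipped to
0 bytes.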


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed more deeply. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may have this issue too.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l; test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't potentially leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
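
A minimal sketch of the kind of validation described above, under stated
assumptions: struct geom_sketch and both function names are hypothetical
and do not come from the driver. It shows why a zero field matters: any
later division by sectors-per-cylinder would trap.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders */
};

/* Reject any geometry with a zero field before it is used as a divisor. */
static int
geom_sketch_validate(const struct geom_sketch *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	return 0;
}

/* Example consumer: the cylinder that holds a given sector. */
static uint64_t
geom_sketch_cylinder(const struct geom_sketch *g, uint64_t sector)
{
	uint64_t secpercyl = (uint64_t)g->nsectors * g->ntracks;

	assert(secpercyl != 0);	/* guaranteed once validation has passed */
	return sector / secpercyl;
}
```

Validating once at configuration time keeps the later consumers free of
per-call checks.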


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name at runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- At compile time, the device major numbers list is packed into the kernel and
the LKM framework refers to it to assign device major numbers dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
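
The effect of truncation can be shown with a short arithmetic sketch. It
assumes 512-byte sectors and that the old behaviour rounded the image
down to whole 1 MB units; the function names are hypothetical and the
code is not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECSIZE	512u
#define SKETCH_MB	(1024u * 1024u)

/* Sectors exposed if the image is rounded down to whole 1 MB units. */
static uint64_t
sectors_truncated(uint64_t image_bytes)
{
	return (image_bytes / SKETCH_MB) * (SKETCH_MB / SKETCH_SECSIZE);
}

/* Sectors exposed if every complete sector of the image is kept. */
static uint64_t
sectors_full(uint64_t image_bytes)
{
	return image_bytes / SKETCH_SECSIZE;
}

/* Sectors at the tail of the image that truncation would hide. */
static uint64_t
sectors_lost(uint64_t image_bytes)
{
	uint64_t full = sectors_full(image_bytes);
	uint64_t trunc = sectors_truncated(image_bytes);

	assert(full >= trunc);
	return full - trunc;
}
```

For a 1.5 MB image (1572864 bytes) this gives 3072 sectors in full but
only 2048 after truncation, so 1024 sectors would be unreachable.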


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface currently doesn't allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.286 31-May-2022 riastradh

vnd(4): Work around deadlock in VNDIOCCLR.

Since the changes this year to eliminate a host of races and
deadlocks in open, close, revoke, attach, and detach, closing the
last instance of a device special node has the side effect of waiting
for all concurrent I/O operations (read, write, ioctl, strategy, &c.)
on the device to complete.

Unfortunately, while this works for physical devices which revoke
open device nodes in their autoconf detach functions, as invoked by
some hardware interrupt indicating that the device is no longer
present, pseudo-devices like vnd(4) work differently -- or, work by
luck, or don't work any more.

VNDIOCCLR acts kind of like an autoconf detach function in that it
revokes open device nodes, which closes the last instance. But
VNDIOCCLR is itself called via ioctl, which is an I/O operation that
close waits for. So we end up with a deadlock, spec_io_drain waiting
for spec_close lower down in the call stack:

> spec_io_drain() at netbsd:spec_io_drain+0x84
> spec_close() at netbsd:spec_close+0x1c6
> VOP_CLOSE() at netbsd:VOP_CLOSE+0x38
> spec_node_revoke() at netbsd:spec_node_revoke+0x14d
> vcache_reclaim() at netbsd:vcache_reclaim+0x4e7
> vgone() at netbsd:vgone+0xcd
> vrevoke() at netbsd:vrevoke+0xfa
> genfs_revoke() at netbsd:genfs_revoke+0x13
> VOP_REVOKE() at netbsd:VOP_REVOKE+0x35
> vdevgone() at netbsd:vdevgone+0x64
> vnddoclear.part.0() at netbsd:vnddoclear.part.0+0xaa
> vndioctl() at netbsd:vndioctl+0x78c
> bdev_ioctl() at netbsd:bdev_ioctl+0x91
> spec_ioctl() at netbsd:spec_ioctl+0xa5
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x41
> vn_ioctl() at netbsd:vn_ioctl+0xb3
> sys_ioctl() at netbsd:sys_ioctl+0x555

In the past, there was a workaround for what was presumably a crash
instead of a deadlock here: don't issue revoke (vdevgone) on the open
character devices for the minor number in use by the ioctl. If you
use, e.g., `vnconfig -u vnd0', and vnconfig(8) picks /dev/rvnd0c or
/dev/rvnd0d, that special case kicks in. But if you use `vnconfig -u
/dev/vnd0d', the ioctl will be issued on the block device instead, so
the special case doesn't kick in, so the operation deadlocks.

It is actually probably safe not to revoke the block device if what
the ioctl caller holds open is that, because specfs(9) forbids more
than one open of a block device, so nothing else can have it open
anyway.

Unclear what the consequences of failing to revoke the character
device are -- but this is what vnd(4) has done all along. cgd(4) and
ccd(4) also don't bother to revoke. We don't have a notion of
`revoke every file descriptor _except_ this one'; only a vnode as a
whole can be revoked, including all references to it.

This is a stop-gap measure to avoid a deadlock we are definitely
hitting on some users. A slightly better measure would be to revoke
the block or character device according to which one is being used,
but that requires a little more work with two different d_ioctl
functions -- and wouldn't address issues with the character device. A
proper solution requires identifying the appropriate protocol for all
of these pseudo-device disk drivers and using it uniformly for them.

Reported on current-users:
https://mail-index.netbsd.org/current-users/2022/05/27/msg042437.html


# 1.285 31-Mar-2022 pgoyette

For device modules that provide both auto-config and /dev/xxx
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:

devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()

While here, add a few missing calls to some of the detach
routines.

Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.

No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!

XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
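
A minimal sketch of the truncation described above, assuming a platform where
unsigned int is 32 bits wide. The names uimin_sketch and MIN_SKETCH are
hypothetical stand-ins, not the libkern definitions; the casts only spell out
the conversion that would otherwise happen silently at the call.

```c
#include <stdint.h>

/* Hypothetical stand-in for a min() that is defined on unsigned int. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: may evaluate its arguments twice, but keeps their type. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* Smaller of two 64-bit values, routed through the unsigned int function. */
static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	/* The upper 32 bits of each argument are discarded here. */
	return uimin_sketch((unsigned int)a, (unsigned int)b);
}

/* Smaller of two 64-bit values, computed by the type-preserving macro. */
static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With a = 0x100000005 and b = 7, the function form compares 5 with 7 and
returns 5, while the macro form returns 7.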


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
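
The difference between the two behaviours named above can be sketched as
follows. The function names are hypothetical and this is not the code from
vnd.c; it only illustrates truncation versus clamping of a 64-bit sector
count into a 32-bit field.

```c
#include <stdint.h>

/* Truncating conversion: keeps only the lower 32 bits of the count. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* Clamping conversion: saturates at UINT32_MAX when the count does not fit. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a count of 0x100000010 sectors, truncation reports 0x10 sectors, whereas
clamping reports UINT32_MAX, which at least signals "too large to represent".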


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
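
As an illustration of what "widen the operation on the RHS" can mean in
general, the sketch below multiplies a sector count by a sector size in two
ways. The function names are hypothetical, the code is not taken from vnd.c,
and it assumes int is 32 bits so that uint32_t operands are not promoted.

```c
#include <stdint.h>

/* Product computed in 32 bits and widened afterwards: wraps modulo 2^32. */
static uint64_t
bytes_narrow(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)(nsectors * secsize);
}

/* One operand widened first, so the product is computed in 64 bits. */
static uint64_t
bytes_wide(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)nsectors * secsize;
}
```

Assigning to a 64-bit variable does not help if the right-hand side has
already been evaluated in a narrower type; the widening has to happen before
the multiplication.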


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
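
The padding problem described above can be sketched with two hypothetical
structures (demo_old and demo_new are illustrations, not the real vnd ioctl
structures). On ABIs that align 64-bit integers to 8 bytes, the offset of the
appended member includes padding, so it overstates the size of the old layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: ends with a 32-bit size field. */
struct demo_old {
	int32_t unit;
	int32_t flags;
	int32_t osize;
};

/* Hypothetical new layout: same prefix with a 64-bit size appended. */
struct demo_new {
	int32_t unit;
	int32_t flags;
	int32_t osize;
	uint64_t size;
};

/* Old size inferred from where the new member starts (may include padding). */
static size_t
old_size_by_offsetof(void)
{
	return offsetof(struct demo_new, size);
}

/* Old size taken as the end of the last old member (no padding). */
static size_t
old_size_by_last_member(void)
{
	return offsetof(struct demo_new, osize) + sizeof(int32_t);
}
```

Where uint64_t needs 8-byte alignment the first function yields 16 and the
second 12; where 4-byte alignment suffices both yield 12. Since an ioctl
number encodes the argument size, a size that differs from the old structure
produces a number that old binaries never send.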


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
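
A userland analogue of the "definitely sparse" test above, as a rough sketch.
It uses stat(2) rather than the kernel's vattr, the function names are
hypothetical, and it assumes st_blocks is counted in 512-byte units, which is
conventional but not guaranteed everywhere.

```c
#include <stdint.h>
#include <sys/stat.h>

/* A file is definitely sparse if fewer bytes are allocated than its size. */
static int
definitely_sparse(uint64_t bytes_allocated, uint64_t logical_size)
{
	return bytes_allocated < logical_size;
}

/* Returns 1 if sparse, 0 if not provably sparse, -1 if stat(2) fails. */
static int
file_definitely_sparse(const char *path)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return -1;
	/* Assumption: st_blocks is in 512-byte units. */
	return definitely_sparse((uint64_t)st.st_blocks * 512,
	    (uint64_t)st.st_size);
}
```

The test is one-sided: a file whose allocated bytes cover its size may still
contain holes, since metadata blocks can be included in the allocation count.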


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same.
The work that the routine's comment describes as its primary function,
end of partition adjustment, is wiped out: at the end of
the routine b_cylinder is set, splat, which overwrites the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
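
The unit mismatch described in "Break two" can be illustrated with a small sketch. This is not the routine from the tree; the helper name, the fixed 512-byte sector size, and the return convention are all assumptions made for illustration. The point is only that the end-of-partition comparison is done in blocks on both sides, and the result is converted back to bytes afterwards.

```c
#include <stdint.h>

#define SKETCH_BSIZE 512	/* assumed sector size in bytes */

/*
 * Hypothetical helper: return how many bytes of a request that starts
 * at block `blkno` and is `bcount` bytes long fit inside a partition of
 * `part_blocks` blocks, or -1 if the request starts out of range.
 */
static int64_t
sketch_clamp_to_partition(int64_t blkno, int64_t bcount, int64_t part_blocks)
{
	int64_t nblocks, max_bytes;

	if (blkno < 0 || bcount < 0 || blkno > part_blocks)
		return -1;

	/* Convert the byte count to blocks, rounding up. */
	nblocks = (bcount + SKETCH_BSIZE - 1) / SKETCH_BSIZE;

	/* Compare blocks with blocks, never blocks with bytes. */
	if (blkno + nblocks > part_blocks)
		nblocks = part_blocks - blkno;

	/* Convert back to bytes before comparing with the byte count. */
	max_bytes = nblocks * SKETCH_BSIZE;
	return bcount < max_bytes ? bcount : max_bytes;
}
```

A request that straddles the end of the partition is shortened, and one that starts exactly at the end transfers nothing.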


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fall back to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.
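
The fallback described above amounts to a capability check. The sketch below is a user-space illustration only: the structure, the stub functions, and the enum are hypothetical stand-ins and not the kernel's vnode operations interface. It also assumes the decision is made by checking which operations are available, which may differ from how the driver actually probes the file system.

```c
#include <stddef.h>

/* Hypothetical stand-in for the operations a backing file system offers. */
struct sketch_fs_ops {
	int (*bmap)(long lblk, long *pblk);	/* logical -> physical block */
	int (*strategy)(long pblk);		/* start block I/O */
};

enum sketch_io_path {
	SKETCH_IO_STRATEGY,	/* use bmap + strategy */
	SKETCH_IO_RDWR		/* fall back to plain read/write */
};

/* Trivial stubs so that a "full featured" file system can be modelled. */
static int
sketch_stub_bmap(long lblk, long *pblk)
{
	*pblk = lblk;
	return 0;
}

static int
sketch_stub_strategy(long pblk)
{
	(void)pblk;
	return 0;
}

/* Pick the block-mapping path only when both operations exist. */
static enum sketch_io_path
sketch_choose_io_path(const struct sketch_fs_ops *ops)
{
	if (ops->bmap != NULL && ops->strategy != NULL)
		return SKETCH_IO_STRATEGY;
	return SKETCH_IO_RDWR;
}
```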


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.
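
A zero block size matters because the block size is typically used as a divisor. A minimal sketch of such a check follows, assuming the number of compressed blocks is derived by dividing the image size by the header's block size; the function name and the choice of EINVAL are illustrative assumptions, not taken from the driver.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the number of blocks covered by an image
 * of `image_size` bytes, refusing a header whose block size is zero.
 */
static int
sketch_comp_numblocks(uint64_t image_size, uint32_t blksize, uint64_t *out)
{
	if (blksize == 0)
		return EINVAL;	/* would divide by zero below */

	/* Round up without risking overflow in the addition. */
	*out = image_size / blksize + (image_size % blksize != 0);
	return 0;
}
```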


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately, this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
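
One plausible set of checks is sketched below. The structure, the function name, and the rule that the geometry must fit inside the backing image are assumptions for illustration; the commit message only states that the validation became more careful. The sketch rejects any zero field (each can end up as a divisor) and guards the multiplications against overflow.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-supplied geometry. */
struct sketch_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders */
};

static int
sketch_check_geom(const struct sketch_geom *g, uint64_t image_bytes)
{
	uint64_t spc, total_sectors;

	/* Any zero field can later be used as a divisor. */
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;

	/* The product of two 32-bit values always fits in 64 bits. */
	spc = (uint64_t)g->nsectors * g->ntracks;

	/* The third factor may overflow, so check before multiplying. */
	if (spc > UINT64_MAX / g->ncylinders)
		return EINVAL;
	total_sectors = spc * g->ncylinders;

	/* Assumed rule: the geometry must not exceed the image. */
	if (total_sectors > image_bytes / g->secsize)
		return EINVAL;

	return 0;
}
```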


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as a constant structure in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- At compile time, the list of device major numbers is packed into the kernel and
the LKM framework will refer to it to assign device major numbers dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix typo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.
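
The idea of one bookkeeping structure per set of component transfers can be sketched as follows. This is a simplified user-space model with hypothetical names, not the structure used in the driver, and it deliberately omits the interrupt protection that real completion handling would need. Each component completion updates the shared record, and only the last one marks the original request as finished, with the first error and the untransferred byte count preserved.

```c
#include <stddef.h>

/* Hypothetical per-request record shared by all component transfers. */
struct sketch_xfer {
	int	pending;	/* component transfers still in flight */
	int	error;		/* first error seen, 0 if none */
	size_t	resid;		/* bytes not (yet) transferred */
	int	done;		/* set once the whole request is complete */
};

static void
sketch_xfer_init(struct sketch_xfer *x, int ncomponents, size_t total_bytes)
{
	x->pending = ncomponents;
	x->error = 0;
	x->resid = total_bytes;
	x->done = 0;
}

/*
 * Called once per finished component. A real driver would have to
 * serialize these updates against concurrent completions.
 */
static void
sketch_component_done(struct sketch_xfer *x, size_t bytes, int error)
{
	if (error != 0) {
		if (x->error == 0)
			x->error = error;	/* keep the first error */
	} else {
		x->resid -= bytes;		/* count successful bytes only */
	}

	if (--x->pending == 0)
		x->done = 1;	/* the caller would complete the original buffer here */
}
```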


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasing, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.285 31-Mar-2022 pgoyette

For device modules that provide both auto-config and /dev/xxx
interfaces, make sure that initialization and destruction
follow the proper sequence. This is triggered by the recent
changes to the devsw stuff; per riastradh@ the required call
sequence is:

devsw_attach()
config_init_component() or config_cf*_attach()
...
config_fini_component() or config_cf*_detach()
devsw_detach()

While here, add a few missing calls to some of the detach
routines.

Testing of these changes has been limited to:
1. compile without build break
2. no related test failures from atf
3. modload/modunload work as well as
before.

No functional device testing done, since I don't have any
of these devices. Let me know of any damage I might cause
here!

XXX Some of the modules affected by this commit are already
XXX broken; see kern/56772. This commit does not break
any additional modules (as far as I know).


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
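
The truncation hazard described in this entry can be reproduced outside the kernel. The sketch below is illustrative only: uimin_sketch and MIN_SKETCH are hypothetical stand-ins that mirror the two shapes the log describes (a function defined on unsigned int, and a comparison macro). They are not the libkern definitions, and the example assumes unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in: a min helper defined on unsigned int. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Hypothetical stand-in: macro form. No narrowing, but arguments may be
 * evaluated more than once. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* 64-bit values are implicitly narrowed to unsigned int at the call. */
static uint64_t
len_via_function(uint64_t len, uint64_t limit)
{
	assert(sizeof(unsigned int) < sizeof(uint64_t));
	return uimin_sketch(len, limit);
}

/* The comparison happens at full 64-bit width. */
static uint64_t
len_via_macro(uint64_t len, uint64_t limit)
{
	return MIN_SKETCH(len, limit);
}
```

With len = 0x100000001 and limit = 0x200000000, the function form compares the narrowed values 1 and 0, while the macro form compares the original 64-bit values.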


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
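
The difference between clamping and keeping the lower 32 bits can be sketched as follows. The function names are hypothetical and are not taken from vnd.c; the sketch only contrasts the two behaviours named in the entry.

```c
#include <assert.h>
#include <stdint.h>

/* Keeps only the lower 32 bits: a large disk appears tiny. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* Clamps to UINT32_MAX: a large disk appears as the largest
 * representable size instead of wrapping around. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	uint32_t out = nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;

	/* Clamping never yields less than the wrapped value. */
	assert(out >= sectors_low32(nsectors));
	return out;
}
```

For a count of 0x100000010 sectors, the first function yields 16 and the second yields UINT32_MAX.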


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.
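
A minimal sketch of what widening the right-hand side means in general. The actual expression changed in vnd.c is not shown in this log, so the names below are hypothetical; the example assumes int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* The multiplication is done in 32 bits and wraps before the
 * result is converted to the 64-bit return type. */
static uint64_t
bytes_narrow(uint32_t nsect, uint32_t secsize)
{
	return nsect * secsize;
}

/* Widening one operand first makes the whole multiplication 64-bit. */
static uint64_t
bytes_wide(uint32_t nsect, uint32_t secsize)
{
	uint64_t bytes = (uint64_t)nsect * secsize;

	assert(secsize == 0 || bytes / secsize == nsect);
	return bytes;
}
```

With 8388609 sectors of 512 bytes (just over 4 GiB), bytes_narrow returns 512 while bytes_wide returns 4294967808.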


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
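
The padding problem described above can be illustrated with a small sketch. The two structures are hypothetical and much simpler than the real struct vnd_user; they only show why offsetof() of an appended 64-bit member can differ from the size of the old structure on ABIs that align 64-bit integers to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit fields, 12 bytes. */
struct demo_user_old {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
};

/* Hypothetical new layout: same fields plus a 64-bit size at the end. */
struct demo_user_new {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
	uint64_t size;
};

/* Old size inferred from the new structure; includes any padding. */
static size_t
old_size_via_offsetof(void)
{
	return offsetof(struct demo_user_new, size);
}

/* Old size taken from the old structure itself. */
static size_t
old_size_via_sizeof(void)
{
	return sizeof(struct demo_user_old);
}

/* Bytes of padding inserted before the 64-bit member (0 or more). */
static size_t
padding_before_size(void)
{
	assert(old_size_via_offsetof() >= old_size_via_sizeof());
	return old_size_via_offsetof() - old_size_via_sizeof();
}
```

Where the padding is non-zero, an ioctl number that encodes the offsetof() value as the argument size no longer matches the number an old binary was compiled with.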


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
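
The "definitely sparse" test compares allocated bytes against logical size. A userland analogue using stat(2) fields might look like the sketch below. This is not the kernel code: the function names are hypothetical, and it assumes st_blocks counts 512-byte units, which is common but not guaranteed everywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* A file that occupies fewer bytes than its length must have holes.
 * The converse does not hold: allocated >= size proves nothing. */
static bool
definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
	return allocated_bytes < logical_size;
}

/* Same test on a struct stat; assumes 512-byte st_blocks units. */
static bool
stat_definitely_sparse(const struct stat *st)
{
	assert(st != NULL);
	return definitely_sparse((uint64_t)st->st_blocks * 512,
	    (uint64_t)st->st_size);
}
```

The check is one-directional: a file can still contain holes while its allocated size (including indirect blocks) is not smaller than its length.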


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
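
The unit mismatch called out in "Break two" can be avoided by doing the comparison in blocks and converting to bytes only for the result. The sketch below is a hypothetical end-of-partition clip, not the routine the entry refers to; all names and the return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many bytes of a request starting at block blkno may be
 * transferred without running past a partition of psize_blks blocks.
 * Returns 0 when the request starts at or beyond the end.
 */
static uint64_t
clip_to_partition(uint64_t blkno, uint64_t bcount,
    uint64_t psize_blks, uint32_t secsize)
{
	uint64_t nblks, avail;

	assert(secsize != 0);
	if (blkno >= psize_blks)
		return 0;

	/* Convert the byte count to blocks, rounding up. */
	nblks = (bcount + secsize - 1) / secsize;
	/* Blocks remaining in the partition: compared blocks to blocks. */
	avail = psize_blks - blkno;

	if (nblks > avail)
		return avail * secsize;	/* back to bytes for the caller */
	return bcount;
}
```

For a 100-block partition with 512-byte sectors, a 2048-byte request at block 98 is clipped to 1024 bytes.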


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
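
A minimal sketch of the kind of check this implies, assuming a geometry
record with four fields. The struct and function names below are
hypothetical and are not the driver's own definitions; the sketch only shows
why every field used as a divisor has to be rejected when zero before it
reaches code that divides by it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct sketch_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Reject any geometry that would later be used as a divisor of zero,
 * e.g. when computing sectors per cylinder or a cylinder number.
 * Returns 0 if the geometry is usable, EINVAL otherwise.
 */
static int
sketch_geom_validate(const struct sketch_geom *g)
{
	assert(g != NULL);

	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;

	return 0;
}
```

Running such a check at configuration time, before the geometry is copied
into the label, keeps the failure at the point where the user can still be
given an error.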


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't allow currently to specify recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.
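
As an illustration of a default "based on 1M cylinders", the sketch below
fabricates a geometry whose cylinders are exactly 1 MiB. The particular
split into 512-byte sectors, 32 sectors per track and 64 tracks per cylinder
is an assumption chosen so that the product is 1 MiB; the names are
hypothetical and this is not claimed to be the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 512 * 32 * 64 = 1048576 bytes (1 MiB) per cylinder. */
#define SKETCH_SECSIZE	512u
#define SKETCH_NSECTORS	32u
#define SKETCH_NTRACKS	64u

/* Hypothetical geometry record; field names are illustrative only. */
struct sketch_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint64_t ncylinders;	/* whole cylinders in the image */
};

/* Fabricate a geometry for an image of image_bytes bytes. */
static struct sketch_geom
sketch_default_geom(uint64_t image_bytes)
{
	struct sketch_geom g;
	uint64_t total_sectors = image_bytes / SKETCH_SECSIZE;
	uint64_t sectors_per_cyl = (uint64_t)SKETCH_NSECTORS * SKETCH_NTRACKS;

	g.secsize = SKETCH_SECSIZE;
	g.nsectors = SKETCH_NSECTORS;
	g.ntracks = SKETCH_NTRACKS;
	/* Integer division: a trailing partial cylinder is not counted. */
	g.ncylinders = total_sectors / sectors_per_cyl;
	return g;
}
```

Because the cylinder count rounds down, an image whose size is not a
multiple of 1 MiB has a tail that the geometry alone does not describe, so
the usable size would need to be tracked separately from the cylinder count.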


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.284 28-Mar-2022 mlelstv

Check INITED state by default for all ioctls but VNDIOCSET. Avoids crashes
with disk_ioctls on default unit, which is not INITED.
Fixes PR 56700.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
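
A minimal sketch of deriving a missing cylinder count from the image size,
assuming the usual sectors-per-track times tracks-per-cylinder arithmetic.
The struct and function names are hypothetical and are not taken from vnd.c;
rounding down and saturating at UINT32_MAX are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* 0 means "not specified" */
};

/* Fill in a missing cylinder count from the image size in bytes. */
static void
geom_fill_cylinders(struct geom_sketch *g, uint64_t image_bytes)
{
	assert(g->secsize != 0 && g->nsectors != 0 && g->ntracks != 0);
	if (g->ncylinders != 0)
		return;		/* caller supplied a value; keep it */

	uint64_t total_sectors = image_bytes / g->secsize;
	uint64_t per_cyl = (uint64_t)g->nsectors * g->ntracks;
	uint64_t ncyl = total_sectors / per_cyl;	/* rounds down */

	/* Saturate rather than wrap if the count does not fit. */
	g->ncylinders = ncyl > UINT32_MAX ? UINT32_MAX : (uint32_t)ncyl;
}
```

For a 1 GiB image with 512-byte sectors, 32 sectors per track and 64 tracks
per cylinder, this yields 1024 cylinders.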


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
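
A small sketch of the truncation described above, assuming a 32-bit
unsigned int. uimin_sketch and MIN_SKETCH are stand-ins written for this
illustration; they mirror the shape of the libkern function and of the
macro quoted in the message but are not copied from the tree.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Stand-in for the libkern helper: operates on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: may evaluate arguments twice, but keeps their full width. */
#define MIN_SKETCH(a, b)	((a) < (b) ? (a) : (b))

/* Arguments are silently converted to unsigned int before comparing. */
static uint64_t
min_truncating(uint64_t a, uint64_t b)
{
	return uimin_sketch(a, b);
}

/* Comparison happens at the full 64-bit width. */
static uint64_t
min_full_width(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With a = 2^32 + 5 and b = 4096, the function form sees 5 and 4096 and
returns 5, while the macro form returns 4096.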


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
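
A sketch of the difference between the two behaviours, assuming a 64-bit
sector count being stored in a 32-bit field. All function names here are
hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits: a medium of 2^32 + 16 sectors reports 16. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* Saturate instead: anything too large reports UINT32_MAX. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* Sector count for a 32-bit label field, from a size in bytes. */
static uint32_t
label_sectors(uint64_t size_bytes, uint32_t secsize)
{
	assert(secsize != 0);
	return sectors_clamped(size_bytes / secsize);
}
```

The clamped value is still wrong for oversized media, but it is a large
number rather than an arbitrary small one.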


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.
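
A sketch of why widening the right-hand side matters, assuming two 32-bit
operands whose product is assigned to a 64-bit variable. The names are
hypothetical; the actual expression changed in vnd.c is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Product is computed in 32 bits and wraps before it is widened. */
static uint64_t
size_narrow(uint32_t nblocks, uint32_t blksize)
{
	return nblocks * blksize;
}

/* Widening one operand first makes the whole multiplication 64-bit. */
static uint64_t
size_widened(uint32_t nblocks, uint32_t blksize)
{
	assert(blksize != 0);
	return (uint64_t)nblocks * blksize;
}
```

With 0x800000 blocks of 512 bytes the narrow form wraps to 0, while the
widened form gives 2^32.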


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
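
A userland sketch of the pattern, using snprintf(3) rather than the kernel
print routines. format_name is a hypothetical helper; the point is only
that the name is passed as data under a fixed "%s" format.

```c
#include <assert.h>
#include <stdio.h>

/* Unsafe pattern, shown only as a comment: snprintf(buf, len, name); */

/* Copy a device name into buf, treating it strictly as data. */
static int
format_name(char *buf, size_t len, const char *name)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "%s", name);
}
```

A name containing conversion specifiers such as "vnd%d%s" is then copied
verbatim instead of being interpreted.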


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
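
A sketch of the padding problem, using deliberately simplified layouts
that are hypothetical and not the real struct vnd_ioctl. Since an ioctl
number encodes the parameter length, a length that includes padding gives
a different number from the one old binaries were built with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old layout: ends with the 32-bit size member. */
struct req_old_sketch {
	uint32_t osize;
};

/* New layout: a 64-bit member appended after the old one. */
struct req_new_sketch {
	uint32_t osize;
	uint64_t size;
};

/* Two ways to state "the length of the old structure". */
#define OLD_LEN_BY_OFFSETOF	offsetof(struct req_new_sketch, size)
#define OLD_LEN_BY_SIZEOF	sizeof(struct req_old_sketch)

static_assert(OLD_LEN_BY_OFFSETOF >= OLD_LEN_BY_SIZEOF,
    "padding can only add bytes");

/* Bytes of alignment padding wrongly counted by the offsetof() form. */
static size_t
old_len_padding(void)
{
	return OLD_LEN_BY_OFFSETOF - OLD_LEN_BY_SIZEOF;
}
```

On ABIs that align 64-bit integers to 8 bytes this reports 4 bytes of
padding; where they are aligned to 4 bytes it reports 0.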


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
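
A sketch of an end-of-partition adjustment that compares blocks with
blocks and converts back to bytes only at the end, as "Break two" asks
for. clip_to_partition is a hypothetical helper written for this note;
it is not the routine being discussed, and its parameters are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many bytes of a transfer starting at block blkno fit
 * inside a partition of part_blocks blocks of secsize bytes each.
 */
static uint64_t
clip_to_partition(uint64_t blkno, uint64_t bcount, uint64_t part_blocks,
    uint32_t secsize)
{
	assert(secsize != 0);
	if (blkno >= part_blocks)
		return 0;			/* starts at or past the end */

	uint64_t avail = part_blocks - blkno;			/* blocks */
	uint64_t need = (bcount + secsize - 1) / secsize;	/* blocks */

	if (need <= avail)
		return bcount;			/* fits: leave it unchanged */
	return avail * secsize;			/* truncate, back in bytes */
}
```

A request of 8192 bytes at block 90 of a 100-block partition with 512-byte
sectors is cut to 5120 bytes.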


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate a buffer this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may have this issue too.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store an i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
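
A minimal sketch of the kind of validation described above, not the driver's actual code. The struct and function names (geom_sketch, geom_sketch_valid) are hypothetical, and the overflow check is an added assumption beyond what the log message states.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a user-supplied disk geometry. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Return 0 when every field is non-zero, so that later code can
 * safely divide by secsize or by nsectors * ntracks; return EINVAL
 * otherwise.
 */
int
geom_sketch_valid(const struct geom_sketch *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	/* Assumption: also reject a sectors-per-cylinder product that
	 * does not fit in 32 bits, since a wrapped product could be 0. */
	if ((uint64_t)g->nsectors * g->ntracks > UINT32_MAX)
		return EINVAL;
	return 0;
}
```

Rejecting the request up front keeps the zero check in one place instead of guarding every later division.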


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as a constant structure in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- At compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.
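
A rough sketch of how a default geometry with 1M cylinders could be derived from an image size. The log message only states the cylinder size; the 512-byte sector, 32 sectors per track and 64 tracks per cylinder split used here is an assumption, and all names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_SECSIZE	512u	/* assumed bytes per sector */
#define SKETCH_NSECTORS	32u	/* assumed sectors per track */
#define SKETCH_NTRACKS	64u	/* assumed tracks per cylinder:
				 * 512 * 32 * 64 = 1 MiB per cylinder */

/* Hypothetical container for the fabricated geometry. */
struct default_geom_sketch {
	uint32_t secsize;
	uint32_t nsectors;
	uint32_t ntracks;
	uint64_t ncylinders;
};

/* Build a fictitious geometry for an image of image_bytes bytes. */
struct default_geom_sketch
default_geom_for(uint64_t image_bytes)
{
	struct default_geom_sketch g;
	uint64_t total_sectors = image_bytes / SKETCH_SECSIZE;

	g.secsize = SKETCH_SECSIZE;
	g.nsectors = SKETCH_NSECTORS;
	g.ntracks = SKETCH_NTRACKS;
	/* Whole cylinders only; a partial trailing cylinder is not counted. */
	g.ncylinders = total_sectors / (SKETCH_NSECTORS * SKETCH_NTRACKS);
	return g;
}
```

Note that the integer division drops any partial trailing cylinder, which is the sort of rounding that later entries in this log (for example 1.84 and 1.279) deal with.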


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.283 24-Jul-2021 andvar

Fix all remaining typos, mainly in comments but also in few definitions and log messages, reported by me in PR kern/54889.
Also fixed some additional typos in comments, found on review of same files or typos.


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


Revision tags: thorpej-i2c-spi-conf-base
# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
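
The truncation described above can be illustrated with a small sketch. The names uimin_sketch and MIN_SKETCH are stand-ins written for this example rather than the libkern definitions, and the example assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Illustrative min defined on unsigned int only. */
unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: compares in the operands' own type, but may evaluate
 * each argument more than once. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* The 64-bit arguments are implicitly converted to unsigned int at
 * the call, so the upper 32 bits are silently dropped. */
uint64_t
truncating_min(uint64_t a, uint64_t b)
{
	return uimin_sketch(a, b);
}

/* No conversion happens here; the comparison is done in 64 bits. */
uint64_t
full_width_min(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With a = 0x100000001 and b = 5, truncating_min() returns 1 because a is reduced to its low 32 bits before the comparison, while full_width_min() returns 5.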


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that do mostly use them inside ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
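
The difference between clamping and keeping the low 32 bits can be sketched as follows. Both helper names are hypothetical and this is not the driver code; it only assumes that a 64-bit sector count has to be reported through a 32-bit field.

```c
#include <stdint.h>

/* Saturate: anything that does not fit is reported as UINT32_MAX. */
static uint32_t
clamp_sectors_sketch(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* Old-style behaviour for comparison: keep only the low 32 bits. */
static uint32_t
low32_sectors_sketch(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

For a count of 2^32 + 7 sectors the low-32-bit variant reports 7, which looks like a tiny disk, while the clamped variant reports the largest representable count.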


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
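
A minimal sketch of the class of bug, assuming a sector count and sector size held in 32-bit variables and an int that is at most 32 bits wide. The function names are hypothetical and the PR's actual expressions are not reproduced here.

```c
#include <stdint.h>

/* The product is computed in 32 bits and only then widened. */
static uint64_t
bytes_narrow_sketch(uint32_t nsectors, uint32_t secsize)
{
	return nsectors * secsize;
}

/* Widening one operand first makes the multiplication 64-bit. */
static uint64_t
bytes_wide_sketch(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)nsectors * secsize;
}
```

With 2^31 sectors of 512 bytes, the narrow variant wraps to 0 while the wide variant yields 2^40 bytes.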


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
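
The hazard can be sketched in userland C. copy_name_sketch is a hypothetical helper, not the code this revision touched; it only shows the usual pattern of passing the name as an argument to a fixed "%s" format.

```c
#include <stdio.h>

/*
 * Copy a device name into buf. The unsafe variant would be
 * snprintf(buf, len, name), which interprets any '%' in the name
 * as a conversion specifier.
 */
static int
copy_name_sketch(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}
```

A name such as "vnd%d" is then copied literally instead of being expanded.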


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
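
The padding effect described in this message can be sketched with two hypothetical layouts. These are not the real vnd ioctl structures; they only assume an old structure ending in a 32-bit field and a new structure that appends a 64-bit field.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit fields, 12 bytes. */
struct old_req_sketch {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
};

/* Hypothetical new layout: same prefix plus a 64-bit size. */
struct new_req_sketch {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
	uint64_t size;
};

/*
 * Difference between "offset of the new field" and the real size of
 * the old structure. Non-zero means offsetof() overstates the old size.
 */
static size_t
padding_before_new_field(void)
{
	return offsetof(struct new_req_sketch, size) -
	    sizeof(struct old_req_sketch);
}
```

Where 64-bit integers need 64-bit alignment the result is 4, so an ioctl number derived from offsetof() would encode 16 bytes instead of the old structure's 12. Where they need only 32-bit alignment the result is 0 and the two agree.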


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and reject files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
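
The "blocks to bytes" mismatch in break two can be avoided by doing the whole comparison in one unit and converting only at the end. The following is a minimal sketch under that assumption; clip_transfer_sketch is a hypothetical helper, not the routine this commit fixed, and it assumes a non-zero sector size.

```c
#include <stdint.h>

/*
 * Returns the number of bytes that may be transferred, 0 if the
 * request starts exactly at the end of the media, or -1 if it starts
 * past the end. All comparisons are made in blocks; the result is
 * converted back to bytes only when the transfer is trimmed.
 */
static int64_t
clip_transfer_sketch(uint64_t blkno, uint32_t bcount,
    uint64_t media_blocks, uint32_t secsize)
{
	/* Round the byte count up to whole blocks. */
	uint64_t nblks = ((uint64_t)bcount + secsize - 1) / secsize;

	if (blkno > media_blocks)
		return -1;
	if (blkno == media_blocks)
		return 0;
	if (nblks > media_blocks - blkno)
		return (int64_t)((media_blocks - blkno) * secsize);
	return bcount;
}
```

On a 100-block media with 512-byte sectors, a 4096-byte request at block 96 is trimmed to the 4 remaining blocks, that is 2048 bytes.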


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem images,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't potentially leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
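
As an illustration of the kind of validation described in 1.104, here is a minimal sketch that rejects any geometry with a zero field before it can be used as a divisor. The struct and function names are hypothetical and are not taken from vnd.c; the actual checks in the driver may differ.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct demo_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Return 0 if every field is usable as a divisor, EINVAL otherwise.
 * A zero in any field could later cause a divide by zero when sizes
 * are derived from the geometry.
 */
static int
demo_geom_validate(const struct demo_geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	return 0;
}
```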


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.
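
The flag inheritance described in 1.66 amounts to masking the parent's flags. A minimal sketch follows, under the assumption of made-up flag values: the `DEMO_` constants are placeholders and do not match the real `B_*` definitions in sys/buf.h.

```c
#include <stdint.h>

/* Placeholder flag bits; the real B_* values live in sys/buf.h. */
#define DEMO_B_READ	0x01u
#define DEMO_B_ASYNC	0x02u
#define DEMO_B_BUSY	0x04u

/*
 * Compute the flags for a component buffer: keep only the read and
 * async bits of the original request and drop everything else.
 */
static uint32_t
demo_component_flags(uint32_t parent_flags)
{
	return parent_flags & (DEMO_B_READ | DEMO_B_ASYNC);
}
```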


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.282 29-Jun-2021 dholland

Add containment for the cloning devices hack in vn_open.

Cloning devices (and also things like /dev/stderr) work by allocating
a struct file, stuffing it in the file table (which is a layer
violation), stuffing the file descriptor number for it in a magic
field of struct lwp (which is gross), and then "failing" with one of
two magic errnos, EDUPFD or EMOVEFD.

Before this commit, all callers of vn_open in the kernel (there are
quite a few) were expected to check for these errors and handle the
situation. Needless to say, none of them except for open() itself did,
resulting in internal negative errnos being returned to userspace.

This hack is fairly deeply rooted and cannot be eliminated all at
once. This commit adds logic to handle the magic errnos inside
vn_open; now on success vn_open returns either a vnode or an integer
file descriptor, along with a flag that says whether the underlying
code requested EDUPFD or EMOVEFD. Callers not prepared to cope with
file descriptors can pass NULL for the extra return values, in which
case if a file descriptor would be produced vn_open fails with
EOPNOTSUPP.

Since I'm rearranging vn_open's signature anyway, stop exposing struct
nameidata. Instead, take three arguments: an optional vnode to use as
the starting point (like openat()), the path, and additional namei
flags to use, restricted to NOCHROOT and TRYEMULROOT. (Other namei
behavior, e.g. NOFOLLOW, can be requested via the open flags.)

This change requires a kernel bump. Ride the one an hour ago.
(That was supposed to be coordinated; did not intend to let an hour
slip by. My fault.)


Revision tags: thorpej-i2c-spi-conf-base
# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

branches: 1.280.2;
Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
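
A rough sketch of how a missing cylinder count could be derived from an image size, as 1.268 describes. The function name is hypothetical, and rounding down to whole cylinders is an assumption made here for illustration; the driver's actual rounding and handling of partial cylinders may differ.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive a cylinder count from the image size
 * and the remaining geometry fields. Returns 0 if any divisor is 0.
 * Dividing step by step gives the same result as dividing by the
 * product of the three fields, without risking overflow in that
 * product.
 */
static uint64_t
demo_cylinders_from_size(uint64_t size_bytes, uint32_t secsize,
    uint32_t nsectors, uint32_t ntracks)
{
	if (secsize == 0 || nsectors == 0 || ntracks == 0)
		return 0;
	return size_bytes / secsize / nsectors / ntracks;
}
```

With 512-byte sectors, 32 sectors per track, and 64 tracks per cylinder, each cylinder is 1 MiB, so an image of 10 MiB plus one byte yields 10 cylinders under this rounding.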


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
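
To make the truncation concern in 1.264 concrete, here is a small sketch contrasting a function defined on unsigned int with a type-generic macro. The `uimin` below is a stand-in written for this example, not the libkern source, and the sketch assumes a platform where unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Stand-in for the libkern function: operates on unsigned int only. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-generic macro in the MIN style; evaluates its arguments twice. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Return 1 when the function and the macro disagree for 64-bit inputs,
 * which happens when an argument does not fit in unsigned int.
 */
static int
uimin_truncates(uint64_t a, uint64_t b)
{
	/* The arguments are implicitly converted to unsigned int here. */
	uint64_t via_function = uimin(a, b);
	uint64_t via_macro = MIN(a, b);

	return via_function != via_macro;
}
```

For example, with a = 0x100000001 and b = 2, the function sees a as 1 and returns 1, while the macro compares the full 64-bit values and returns 2.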


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.
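
For readers unfamiliar with the situation 1.262 complains about, the following sketch shows one kind of bounds check that can fail on a 32-bit size_t but is always true on a 64-bit one. It is an illustrative assumption, not the check that was committed, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical check: can a table of `count` 64-bit offsets be sized
 * without overflowing size_t? With a 64-bit size_t every uint32_t
 * count passes; with a 32-bit size_t large counts are rejected.
 */
static int
demo_offset_table_fits(uint32_t count)
{
	/* Widen both sides so the comparison is well defined on any ABI. */
	return (uintmax_t)count <= (uintmax_t)SIZE_MAX / sizeof(uint64_t);
}
```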


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
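
The difference between clamping and taking the lower 32 bits can be sketched as follows. The function names are hypothetical and the code is a generic illustration, not the code in vnd.c.

```c
#include <stdint.h>

/* Clamp: a count that does not fit is reported as the maximum. */
static uint32_t
clamp_sectors32(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* Truncate: keeps only the lower 32 bits, so large counts wrap. */
static uint32_t
truncate_sectors32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

For a count of 2^32 + 16 sectors, truncation reports a 16-sector disk, while clamping reports the largest value the 32-bit field can hold.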


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.
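
Why widening the right-hand side matters can be sketched as below. The types and names are assumptions chosen for the illustration and do not reproduce the expression in vnd.c. In C the multiplication is evaluated in the type of its operands, so assigning the result to a 64-bit variable does not help unless an operand is widened first.

```c
#include <stdint.h>

/* Multiplication happens in 32 bits and wraps before the assignment. */
static uint64_t
image_bytes_narrow(uint32_t nblocks, uint32_t blksize)
{
	uint64_t bytes = nblocks * blksize;
	return bytes;
}

/* Widening one operand makes the whole multiplication 64-bit. */
static uint64_t
image_bytes_wide(uint32_t nblocks, uint32_t blksize)
{
	uint64_t bytes = (uint64_t)nblocks * blksize;
	return bytes;
}
```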


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
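
The general pattern behind this fix, shown with standard snprintf rather than the kernel's printing functions (the helper name is hypothetical): pass externally supplied text as an argument to a fixed "%s" format, so any '%' characters in it are copied literally.

```c
#include <stdio.h>

/*
 * Copy a name into dst. The unsafe variant would be
 *     snprintf(dst, dstlen, name);
 * which interprets any '%' in the name as a conversion specifier.
 */
static int
copy_name(char *dst, size_t dstlen, const char *name)
{
	return snprintf(dst, dstlen, "%s", name);
}
```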


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
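
The padding problem described above can be sketched with two hypothetical layouts (the field names are illustrative and are not vnd's). Using offsetof() of a newly appended 64-bit member as "the size of the old structure" only works where no padding is inserted before that member.

```c
#include <stddef.h>
#include <stdint.h>

/* Old layout: ends with the 32-bit size. */
struct example_old {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
};

/* New layout: a 64-bit size appended at the end. */
struct example_new {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
	uint64_t size;
};

/* What the old size is assumed to be when derived from the new struct. */
static size_t
old_size_via_offsetof(void)
{
	return offsetof(struct example_new, size);
}

/* What the old size really is. */
static size_t
old_size_true(void)
{
	return sizeof(struct example_old);
}
```

Where 64-bit integers are 8-byte aligned, the first function yields 16 and the second 12. Since an ioctl number encodes the argument size, the two values produce different ioctl numbers.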


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
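
The "definitely sparse" test mentioned above compares allocated bytes with logical size. A minimal sketch of that comparison, with a hypothetical function name and plain integers standing in for the va_bytes and va_size attributes:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A file that occupies fewer bytes on disk than its logical size must
 * contain holes. The converse does not hold: allocated >= size does
 * not prove the file is free of holes, hence "definitely".
 */
static bool
definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
	return allocated_bytes < logical_size;
}
```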


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed more deeply. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf
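
A minimal sketch of the sprintf-to-snprintf pattern, using a hypothetical helper (the name and the "vnd%d" format are assumptions for the example): snprintf is told the buffer size, never writes past it, and its return value shows whether the output was truncated.

```c
#include <stdio.h>

/*
 * Format a unit name into buf. Returns 0 on success and -1 if the
 * name did not fit (buf then holds a truncated, NUL-terminated name).
 * sprintf(buf, "vnd%d", unit) would instead overflow a small buffer.
 */
static int
format_unit_name(char *buf, size_t buflen, int unit)
{
	int n = snprintf(buf, buflen, "vnd%d", unit);

	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```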


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
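
The kind of validation described above might look like the sketch below. The struct, its field names and the exact set of checks are assumptions for illustration; the actual checks in vnd.c are not shown in this log. The point is to reject zero values before they can be used as divisors.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-supplied geometry; not the real vnd structure. */
struct example_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* total cylinders */
};

/* Return 0 if the geometry is usable, EINVAL otherwise. */
static int
validate_geom(const struct example_geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	return 0;
}
```

Later code that computes, for example, sectors per cylinder as nsectors * ntracks and divides by it can then rely on the divisor being nonzero.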


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't allow currently to specify recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses find will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.281 13-Jun-2021 mlelstv

Fail to open read-write when created read-only.


Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base
# 1.280 11-Apr-2021 mlelstv

Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

branches: 1.278.2;
Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-2-RELEASE netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
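
The truncation hazard described in this message can be sketched as follows. This is only an illustration: `uimin_sketch` and `MIN_SKETCH` are hypothetical stand-ins, not the libkern definitions, and `uint32_t` is used to model a 32-bit `unsigned int` parameter on an LP64 system.

```c
#include <stdint.h>

/* Hypothetical stand-in for a min() that is defined on unsigned int.
 * uint32_t models a 32-bit unsigned int. */
static uint32_t
uimin_sketch(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Hypothetical stand-in for a type-generic macro: it does not truncate,
 * but each argument may be evaluated twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/* 64-bit values passed through the function keep only their low 32 bits.
 * The explicit casts show what an implicit conversion would do silently. */
static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	return uimin_sketch((uint32_t)a, (uint32_t)b);
}

/* The macro compares the full 64-bit values. */
static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With `a = 0x100000001` and `b = 5`, the function-based version compares 1 against 5 because the high bit of `a` is lost, while the macro-based version compares the full values.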


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.
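
The message does not say which sanity check was affected, so the following is only a generic sketch of one common way to bound a product without letting it wrap around. The function name and parameters are hypothetical and are not taken from vnd.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if nsectors * secsize <= limit. The product itself is
 * never computed, so it cannot overflow; the check divides instead. */
static bool
product_within_limit(uint64_t nsectors, uint64_t secsize, uint64_t limit)
{
	if (nsectors == 0 || secsize == 0)
		return true;	/* the product is 0, which fits any limit */
	return nsectors <= limit / secsize;
}
```

For positive integers, `nsectors * secsize <= limit` holds exactly when `nsectors <= limit / secsize` with truncating division, which is why the rearranged comparison is equivalent.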


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
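
The difference between clamping and keeping the lower 32 bits can be sketched as below. The function names are hypothetical and are not the ones used in vnd.c.

```c
#include <stdint.h>

/* Clamp: a count too large for 32 bits is reported as UINT32_MAX. */
static uint32_t
clamp_to_u32(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* Truncate: a count too large for 32 bits wraps to a small value. */
static uint32_t
low_32_bits(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

For a count of `0x100000010` sectors, truncation yields 16, which looks like a tiny disk, whereas clamping yields the largest representable value.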


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.
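
The padding effect described above can be sketched as follows. The two structures are hypothetical stand-ins, not the real vnd ioctl structures; they only show why the offset of a trailing 64 bit member is not a reliable substitute for the size of the old structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32 bit members, 12 bytes in total. */
struct demo_old_ioctl {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
};

/* Hypothetical new layout: the same members plus a 64 bit size. */
struct demo_new_ioctl {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
	uint64_t size;	/* may be preceded by alignment padding */
};

/* "Old size" derived from the new structure; includes any padding. */
size_t
demo_old_size_by_offsetof(void)
{
	return offsetof(struct demo_new_ioctl, size);
}

/* Old size taken from a separate definition of the old structure. */
size_t
demo_old_size_by_sizeof(void)
{
	return sizeof(struct demo_old_ioctl);
}
```

Where 64 bit integers are 64 bit aligned, the first function yields 16 and the second 12. Because the structure size is encoded in the ioctl number, the two values produce different ioctl numbers.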

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
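
A userland sketch of the same comparison, under the assumption that st_blocks counts 512 byte units (common, but not guaranteed everywhere). The function names are hypothetical; the kernel code compares va_bytes with va_size from the vnode attributes instead.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * A file whose allocated bytes are fewer than its logical size must
 * contain holes. The reverse does not hold: a file can pass this test
 * and still be sparse, so this only catches the definite cases.
 */
bool
demo_definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
	return allocated_bytes < logical_size;
}

/* Returns 1 if definitely sparse, 0 if not, -1 if stat() fails. */
int
demo_path_is_sparse(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	/* Assumption: st_blocks is expressed in 512 byte units. */
	return demo_definitely_sparse((uint64_t)st.st_blocks * 512,
	    (uint64_t)st.st_size);
}
```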


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same.
The work that the routine's comment describes as its primary function,
end of partition adjustment, is wiped out: at the end of
the routine b_cylinder is set, splat, wiping out the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
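
The unit mismatch in "Break two" can be sketched like this. It is a simplified illustration with a hypothetical function name and an assumed 512 byte sector size, not the routine the message discusses; the point is that the comparison is made blocks to blocks before converting back to bytes.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u	/* assumption for this sketch */

/*
 * Clamp a transfer of nbytes starting at block start_blk so that it
 * does not run past a partition of part_blks blocks. Returns the
 * number of bytes that may be transferred, or 0 at or past the end.
 */
uint64_t
demo_clamp_to_partition(uint64_t start_blk, uint64_t nbytes,
    uint64_t part_blks)
{
	uint64_t req_blks, avail_blks;

	if (start_blk >= part_blks)
		return 0;

	/* Convert the request to blocks, rounding up. */
	req_blks = (nbytes + DEMO_SECTOR_SIZE - 1) / DEMO_SECTOR_SIZE;
	avail_blks = part_blks - start_blk;

	/* Compare blocks with blocks, never blocks with bytes. */
	if (req_blks > avail_blks)
		return avail_blks * DEMO_SECTOR_SIZE;
	return nbytes;
}
```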


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was commited. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l; test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store an i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
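
A minimal sketch of this kind of validation, assuming a simplified geometry structure. The structure and function names are hypothetical and do not mirror the real vnd or disklabel definitions; they only show rejecting zero fields before any division uses them.

```c
#include <stdint.h>

/* Hypothetical, simplified geometry as it might arrive from userland. */
struct demo_geom {
	uint32_t secsize;
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;
};

/* A geometry with any zero field cannot be used as a divisor later. */
int
demo_geom_valid(const struct demo_geom *g)
{
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}

/* Map a block number to its cylinder; returns -1 on bad geometry. */
int
demo_block_to_cylinder(const struct demo_geom *g, uint64_t blkno,
    uint64_t *cyl)
{
	uint64_t secpercyl;

	if (!demo_geom_valid(g))
		return -1;
	/* Widen before multiplying so the product cannot overflow. */
	secpercyl = (uint64_t)g->nsectors * g->ntracks;
	*cyl = blkno / secpercyl;
	return 0;
}
```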


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed, it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
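
The effect of truncating can be illustrated with simple arithmetic. This is only a sketch with hypothetical function names and an assumed 512 byte sector size; it does not reproduce how vnd.c computed the size before or after this change.

```c
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512u		/* assumption for this sketch */
#define DEMO_MB (1024u * 1024u)

/* Size in sectors after rounding the file down to whole megabytes. */
uint64_t
demo_sectors_truncated(uint64_t file_bytes)
{
	return (file_bytes / DEMO_MB) * (DEMO_MB / DEMO_SECTOR_SIZE);
}

/* Size in sectors using every complete sector in the file. */
uint64_t
demo_sectors_exact(uint64_t file_bytes)
{
	return file_bytes / DEMO_SECTOR_SIZE;
}
```

For a 1.5 MB image (1572864 bytes) the truncated size is 2048 sectors while the exact size is 3072 sectors, so a third of the image would be unreachable.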


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.
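
A small sketch of why the wider type matters, using int64_t as a portable stand-in for off_t. The function name is hypothetical; the point is that the multiplication is carried out in 64 bits, since a block number times a block size can exceed the range of a 32 bit int.

```c
#include <stdint.h>

/*
 * Convert a block number to a byte offset. Both operands are widened
 * to 64 bits before multiplying, so the result is correct even when
 * it does not fit in a 32 bit int.
 */
int64_t
demo_block_to_byte_offset(int64_t blkno, int32_t blksize)
{
	return blkno * (int64_t)blksize;
}
```

Block 5000000 with 512 byte blocks lies at byte 2560000000, which is already beyond the largest value a signed 32 bit integer can hold.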


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.
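
The flag inheritance can be sketched as a simple mask. The flag values below are made up for illustration and are not the kernel's B_* definitions, and the real code may set further flags of its own on the component buffers.

```c
/* Hypothetical flag values; the real B_* constants differ. */
#define DEMO_B_READ	0x01u
#define DEMO_B_ASYNC	0x02u
#define DEMO_B_OTHER	0x04u	/* stands for any flag not inherited */

/* Keep only the read and async bits from the original buffer's flags. */
unsigned
demo_component_flags(unsigned orig_flags)
{
	return orig_flags & (DEMO_B_READ | DEMO_B_ASYNC);
}
```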


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface currently doesn't allow recursive locks to be specified.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?
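
A minimal sketch of the explicit-cast approach described above, assuming a pointer difference is being formatted. The function name is hypothetical; snprintf() is used instead of the kernel printf() so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format the distance between two pointers into the same array.
 * ptrdiff_t may be int, long or long long depending on the ABI, so
 * the argument is cast to the exact type that %ld expects.
 */
int
ex_format_offset(char *out, size_t outlen, const char *base, const char *p)
{
	ptrdiff_t diff = p - base;

	return snprintf(out, outlen, "%ld", (long)diff);
}
```

C99 later standardized the %td and %zu conversions for ptrdiff_t and size_t, which remove the need for such casts where the printf implementation supports them.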


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.280 11-Apr-2021 mlelstv

Provide a default preferred I/O size.


# 1.279 11-Apr-2021 mlelstv

Don't truncate disk size to full cylinders.


Revision tags: thorpej-cfargs-base thorpej-futex-base
# 1.278 04-Jan-2021 mlelstv

Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

branches: 1.277.2;
pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
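
One way to derive a missing cylinder count is sketched below. The structure and function are hypothetical stand-ins, not the driver's own geometry code, and the sketch assumes that a count of 0 means "not specified" and that partial cylinders are rounded down.

```c
#include <stdint.h>

/* Hypothetical geometry description, for illustration only. */
struct ex_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* 0 means "not specified" (assumption) */
};

/*
 * Fill in ncylinders from the image size if it was left unset.
 * Returns 0 on success, -1 if the geometry is unusable.
 * Assumes secsize * nsectors * ntracks fits in 64 bits.
 */
int
ex_fill_cylinders(struct ex_geom *g, uint64_t image_bytes)
{
	uint64_t bytes_per_cyl, ncyl;

	if (g->ncylinders != 0)
		return 0;	/* caller supplied a value; keep it */

	bytes_per_cyl = (uint64_t)g->secsize * g->nsectors * g->ntracks;
	if (bytes_per_cyl == 0)
		return -1;

	ncyl = image_bytes / bytes_per_cyl;	/* rounds down */
	if (ncyl > UINT32_MAX)
		return -1;

	g->ncylinders = (uint32_t)ncyl;
	return 0;
}
```

For example, a 1 GiB image with 512-byte sectors, 32 sectors per track and 64 tracks per cylinder has 1 MiB cylinders, so the sketch yields 1024 cylinders.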


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
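
The truncation hazard described above can be illustrated in userland. This is a sketch with hypothetical names (ex_uimin, EX_MIN), assuming a platform where unsigned int is 32 bits wide; it is not the libkern source.

```c
#include <stdint.h>

/* Same shape as the helper described above: operands are unsigned int. */
static inline unsigned int
ex_uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro; note that it evaluates its arguments twice. */
#define EX_MIN(a, b)	((a) < (b) ? (a) : (b))

/*
 * Passing 64-bit values through the unsigned int helper.  The casts
 * are written out here; at a real call site the conversion would be
 * implicit and silent.
 */
uint64_t
ex_truncating_min(uint64_t a, uint64_t b)
{
	return ex_uimin((unsigned int)a, (unsigned int)b);
}
```

With a = 2^32 + 1 and b = 5, the helper sees a as 1 and returns 1, while the macro compares the full 64-bit values and returns 5.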


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.
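
The kind of check being discussed can be sketched as follows. This is an illustration under assumptions, not the code from this revision: the function name and the idea of sizing a table of 64-bit entries from a 32-bit count are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether a table of `nentries' 64-bit entries can be sized
 * without overflowing size_t.  Returns 1 and stores the byte count on
 * success, 0 otherwise.
 *
 * On LP64 the comparison can never be true for a 32-bit count, which
 * is the sort of always-false test a compiler may warn about; on
 * ILP32 it is a real check.
 */
int
ex_table_size_ok(uint32_t nentries, size_t *bytes_out)
{
	if ((uint64_t)nentries > SIZE_MAX / sizeof(uint64_t))
		return 0;

	*bytes_out = (size_t)nentries * sizeof(uint64_t);
	return 1;
}
```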


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
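
The difference between clamping and keeping the low 32 bits can be sketched as below. The function name is hypothetical; only the arithmetic is the point.

```c
#include <stdint.h>

/*
 * Report a 64-bit sector count through a 32-bit field.  Values that
 * do not fit saturate at UINT32_MAX instead of wrapping around.
 */
uint32_t
ex_clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

A plain cast of 0x100000010 would yield 0x10, making a very large disk look tiny; the clamped value at least signals "as large as the field can express".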


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
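
A minimal sketch of what designated initializers look like for a switch-style structure. The structure below is a simplified, hypothetical stand-in; the real struct bdevsw and struct cdevsw have different members and signatures.

```c
#include <stddef.h>

/* Simplified, hypothetical device switch; not the real bdevsw layout. */
struct ex_devsw {
	int (*d_open)(int unit);
	int (*d_close)(int unit);
	int (*d_discard)(int unit, long pos, long len);
	int d_flag;
};

static int
ex_open(int unit)
{
	(void)unit;
	return 0;
}

static int
ex_close(int unit)
{
	(void)unit;
	return 0;
}

/*
 * Each member is named explicitly, so entries cannot silently land in
 * the wrong slot, and members that are not listed (d_discard here)
 * are zero-initialized.
 */
const struct ex_devsw ex_bdevsw = {
	.d_open = ex_open,
	.d_close = ex_close,
	.d_flag = 1,
};
```

With positional initializers, adding or reordering a member shifts every following entry; with designated initializers, existing tables stay correct.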


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
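
The class of bug behind this PR can be sketched as a multiplication carried out in too narrow a type. The functions below are hypothetical illustrations, not the actual diff, and assume a platform where unsigned int is 32 bits wide.

```c
#include <stdint.h>

/*
 * The product is computed in 32 bits and wraps before it is widened
 * for the return value.
 */
uint64_t
ex_bytes_narrow(uint32_t nsectors, uint32_t secsize)
{
	return nsectors * secsize;
}

/* Widening one operand first makes the whole product 64-bit. */
uint64_t
ex_bytes_wide(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)nsectors * secsize;
}
```

For 2^31 sectors of 512 bytes (1 TiB), the narrow version wraps to 0 while the widened version returns 2^40.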


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
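
The hazard being fixed is the general printf-family rule that externally influenced strings must be passed as arguments, never as the format. A userland sketch with a hypothetical function name:

```c
#include <stdio.h>

/*
 * Copy a device name into a message buffer.
 *
 * Wrong:  snprintf(out, outlen, devname);
 *         A '%' in the name would be parsed as a conversion.
 * Right:  pass the name as data for a constant "%s" format.
 */
int
ex_describe(char *out, size_t outlen, const char *devname)
{
	return snprintf(out, outlen, "%s", devname);
}
```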


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
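
The sparse-file test named above compares allocated bytes with the logical size. A minimal sketch, using a hypothetical two-field stand-in for the vnode attributes rather than the real struct vattr:

```c
#include <stdint.h>

/* Hypothetical subset of the vnode attributes, for illustration only. */
struct ex_vattr {
	uint64_t va_size;	/* logical file size in bytes */
	uint64_t va_bytes;	/* bytes of disk space allocated */
};

/*
 * A file that occupies fewer bytes on disk than its logical size must
 * contain holes.  The test is one-sided: a result of 0 does not prove
 * that the file has no holes.
 */
int
ex_definitely_sparse(const struct ex_vattr *va)
{
	return va->va_bytes < va->va_size;
}
```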


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that they work.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
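
"Break two" above is a unit mismatch: a count in blocks was compared against a count in bytes. The following is a minimal sketch of an end-of-partition adjustment that keeps every comparison in sectors and only converts back to bytes at the end. It is not the kernel routine being discussed; the function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: clamp a transfer so that it does not run past
 * the end of a partition.  The request length is converted to sectors,
 * compared against the remaining sectors, and the result is returned
 * in bytes.  Returns 0 when the start is at or past the end.
 * Overflow of byte_count near UINT64_MAX is not handled in this sketch.
 */
static uint64_t
clamp_to_partition(uint64_t start_sector, uint64_t byte_count,
    uint64_t part_sectors, uint32_t sector_size)
{
	uint64_t want, avail;

	assert(sector_size != 0);
	if (start_sector >= part_sectors)
		return 0;

	/* Sectors requested, rounded up. */
	want = (byte_count + sector_size - 1) / sector_size;
	/* Sectors left between the start and the end of the partition. */
	avail = part_sectors - start_sector;

	if (want <= avail)
		return byte_count;		/* fits entirely */
	return avail * sector_size;		/* truncated, in bytes */
}
```

For example, a 4096-byte request starting two 512-byte sectors before the end of the partition is cut down to 1024 bytes.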


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffer this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed, it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.
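
The choice described above is to fail the request rather than hand back a label with partitions silently dropped. A small sketch of that policy follows. The structures are hypothetical stand-ins for the new and old label layouts, the limit of 8 is taken from the "> 8 partitions" wording above, and the particular error value returned is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define NEW_MAX_PARTS	16	/* assumption: a port that bumped the limit */
#define OLD_MAX_PARTS	8	/* per "> 8 partitions" above */

/* Hypothetical, heavily simplified label layouts. */
struct new_label {
	unsigned npartitions;
	uint32_t size[NEW_MAX_PARTS];
};

struct old_label {
	unsigned npartitions;
	uint32_t size[OLD_MAX_PARTS];
};

/*
 * Convert a new-style label for an old binary.  If it does not fit,
 * return an error and leave *dst untouched instead of truncating.
 */
static int
to_old_label(const struct new_label *src, struct old_label *dst)
{
	assert(src != NULL && dst != NULL);

	if (src->npartitions > OLD_MAX_PARTS)
		return EINVAL;	/* assumed errno; refuse, do not truncate */

	memset(dst, 0, sizeof(*dst));
	dst->npartitions = src->npartitions;
	memcpy(dst->size, src->size,
	    src->npartitions * sizeof(src->size[0]));
	return 0;
}
```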


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.
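
Inheriting only selected flags amounts to masking the original flag word before it is copied into each component buffer. A minimal sketch follows, with made-up flag values. The XB_ names are hypothetical stand-ins and their numeric values are not the kernel's.

```c
#include <assert.h>

/* Hypothetical flag bits; the values are illustrative only. */
#define XB_READ		0x01u	/* stand-in for B_READ */
#define XB_ASYNC	0x02u	/* stand-in for B_ASYNC */
#define XB_BUSY		0x04u	/* some state that must not leak */
#define XB_DONE		0x08u	/* likewise */

/*
 * Flags for a component buffer created while splitting a transfer:
 * keep the direction and the async hint, drop everything else.
 */
static unsigned
inherit_flags(unsigned orig_flags)
{
	return orig_flags & (XB_READ | XB_ASYNC);
}
```

State bits such as a busy or done marker describe the original buffer only, so copying them into a freshly set-up component would misdescribe it.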


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't allow currently to specify recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.
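
A default "based on 1M cylinders" can be obtained by picking sector, track, and head counts whose product is 1 MiB and deriving the cylinder count from the image size. The sketch below does that. The structure is hypothetical, and the specific values 512, 32, and 64 are an assumption chosen only because they multiply to 1 MiB.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry description; not the driver's own structure. */
struct geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

#define CYL_BYTES	(1024u * 1024u)		/* one cylinder = 1 MiB */

/*
 * Fabricate a geometry for an image of media_bytes.  The cylinder
 * count rounds down, so a trailing partial cylinder is not described
 * by the geometry.  Images of 4 PiB or more would overflow the 32-bit
 * cylinder count and are outside the scope of this sketch.
 */
static struct geom
default_geom(uint64_t media_bytes)
{
	struct geom g;

	g.secsize = 512;
	g.nsectors = 32;
	g.ntracks = 64;		/* 512 * 32 * 64 == 1 MiB per cylinder */
	assert((uint64_t)g.secsize * g.nsectors * g.ntracks == CYL_BYTES);
	g.ncylinders = (uint32_t)(media_bytes / CYL_BYTES);
	return g;
}
```

Because of the rounding, the usable size of the unit is better taken from the image size itself than recomputed from the fabricated geometry.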


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.278 04-Jan-2021 mlelstv

Fix calculation of cylinder count from medium size.
Pullups needed.


Revision tags: thorpej-futex-base bouyer-xenpvh-base2
# 1.277 23-Apr-2020 jdolecek

pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
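
A minimal sketch of the kind of calculation this describes, assuming the geometry spec supplies sector size, sectors per track and tracks per cylinder, and only the cylinder count is missing. The function name and the rounding (truncating division) are assumptions; the actual code in vnd.c may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive a cylinder count from the image size and
 * the remaining geometry fields. A partial trailing cylinder is dropped.
 */
static uint64_t
cylinders_from_size(uint64_t size_bytes, uint32_t secsize,
    uint32_t nsectors, uint32_t ntracks)
{
    uint64_t sectors_per_cylinder;

    assert(secsize != 0 && nsectors != 0 && ntracks != 0);
    sectors_per_cylinder = (uint64_t)nsectors * ntracks;
    return (size_bytes / secsize) / sectors_per_cylinder;
}
```

For a 1 GiB image with 512-byte sectors, 32 sectors per track and 64 tracks per cylinder, this gives 2097152 / 2048 = 1024 cylinders.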


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
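
The truncation described above can be illustrated with a small sketch. `uimin_sketch` stands in for the libkern function and is not the real definition; `narrow_min` spells out, with explicit casts, the conversion that happens implicitly when 64-bit values are passed to a function declared on unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the libkern helper: defined on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * What a caller holding 64-bit values effectively gets: both arguments
 * are converted to unsigned int before the comparison.
 */
static uint64_t
narrow_min(uint64_t a, uint64_t b)
{
    assert(sizeof(unsigned int) == 4);    /* assumption of this sketch */
    return uimin_sketch((unsigned int)a, (unsigned int)b);
}

/* Type-preserving comparison, in the spirit of the MIN() macro. */
static uint64_t
wide_min(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}
```

With a = 0x100000001 and b = 5, `wide_min` returns 5, while `narrow_min` returns 1 because the upper 32 bits of a are discarded before the comparison.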


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
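
A sketch of the difference between the two behaviours, assuming the sector count is derived from a 64-bit media size and stored in a 32-bit field. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old behaviour (sketch): keep only the lower 32 bits of the count. */
static uint32_t
sectors_low32(uint64_t media_bytes, uint32_t secsize)
{
    assert(secsize != 0);
    return (uint32_t)(media_bytes / secsize);
}

/* New behaviour (sketch): saturate at UINT32_MAX instead of wrapping. */
static uint32_t
sectors_clamped(uint64_t media_bytes, uint32_t secsize)
{
    uint64_t nsectors;

    assert(secsize != 0);
    nsectors = media_bytes / secsize;
    return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a medium of 0x100000010 sectors, the first variant reports 0x10 sectors, which looks like a tiny disk; the second reports UINT32_MAX, which is at least recognisably "too large to represent".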


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
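
The class of overflow behind this change and the follow-up in 1.224 can be sketched as below. Here uint32_t stands in for a 32-bit size_t; the exact expressions changed in vnd.c are not reproduced, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Product computed in 32 bits: wraps modulo 2^32 for large media. */
static uint32_t
bytes_narrow(uint32_t nsect, uint32_t secsize)
{
    assert(secsize != 0);
    return nsect * secsize;
}

/* Widening one operand first makes the whole product 64-bit. */
static uint64_t
bytes_wide(uint32_t nsect, uint32_t secsize)
{
    assert(secsize != 0);
    return (uint64_t)nsect * secsize;
}
```

With 0x800000 sectors of 512 bytes (4 GiB), `bytes_narrow` wraps to 0 while `bytes_wide` returns 0x100000000. Assigning a narrow product to a 64-bit variable does not help, because the multiplication has already wrapped; the widening has to happen on the right-hand side.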


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
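
The padding problem can be sketched with a made-up structure; `struct conf_sketch` is hypothetical and is not the real vnd ioctl structure. Taking the offset of the newly appended 64-bit member as "the size of the old structure" silently includes any alignment padding in front of that member, and since the argument size is encoded in the ioctl number, the compat ioctl number then differs from the one old binaries use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: an old 32-bit size followed by a new 64-bit one. */
struct conf_sketch {
    uint32_t    unit;
    uint32_t    flags;
    uint32_t    osize;    /* last member of the old structure */
    uint64_t    size;     /* member appended later */
};

/* Old size taken as the offset of the new member: may include padding. */
static size_t
old_size_via_new_member(void)
{
    return offsetof(struct conf_sketch, size);
}

/* Old size taken as the end of the last old member: never includes it. */
static size_t
old_size_via_last_member(void)
{
    return offsetof(struct conf_sketch, osize) + sizeof(uint32_t);
}

/* Padding inserted before the 64-bit member on this ABI. */
static size_t
padding_before_size(void)
{
    assert(old_size_via_new_member() >= old_size_via_last_member());
    return old_size_via_new_member() - old_size_via_last_member();
}
```

On an ABI that aligns 64-bit integers to 8 bytes the padding in this sketch is 4 bytes; on one that aligns them to 4 bytes inside structures it is 0. Which case applies depends on the preceding members as well as the ABI, which is why the real structure misbehaved on some platforms and not others.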


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
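
A userland analogue of the `va_bytes < va_size` test, as a sketch only: the kernel code works on vattr fields, whereas this uses stat(2), and it assumes st_blocks counts 512-byte units. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/stat.h>

/* A file with fewer allocated bytes than its length must have holes. */
static bool
definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
    return allocated_bytes < logical_size;
}

/* Returns 1 if sparse, 0 if not provably sparse, -1 if stat(2) fails. */
static int
file_definitely_sparse(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) == -1)
        return -1;
    /* Assumption: st_blocks is in 512-byte units. */
    return definitely_sparse((uint64_t)st.st_blocks * 512,
        (uint64_t)st.st_size);
}
```

The test is one-sided: a file can pass it and still contain holes, for example when metadata blocks are counted in the allocation, which matches the "definitely sparse" wording of the commit.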


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
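
"Break two" above is a unit mismatch. A sketch of an end-of-partition adjustment that stays in one unit for the comparison and converts back to bytes only for the result follows; the function name, signature and return convention are hypothetical and do not reproduce the kernel routine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: clamp a transfer of bcount bytes starting at
 * sector blkno to a medium of media_sectors sectors.
 * Returns the (possibly shortened) byte count, 0 for a transfer that
 * starts exactly at the end, or -1 for one that starts past the end.
 */
static int64_t
clamp_transfer(uint64_t blkno, uint64_t bcount, uint32_t secsize,
    uint64_t media_sectors)
{
    uint64_t nsect, avail;

    assert(secsize != 0);
    if (blkno > media_sectors)
        return -1;
    if (blkno == media_sectors)
        return 0;

    nsect = (bcount + secsize - 1) / secsize;    /* bytes -> sectors */
    avail = media_sectors - blkno;               /* sectors */
    if (nsect > avail)                           /* sectors vs sectors */
        return (int64_t)(avail * secsize);       /* sectors -> bytes */
    return (int64_t)bcount;
}
```

A 4096-byte request at sector 8 of a 10-sector medium with 512-byte sectors is shortened to 1024 bytes; comparing the sector count against the original byte count, as the commit describes, would compare 8 with 4096 and never trigger the adjustment correctly.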


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was commited. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't allow currently to specify recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.277 23-Apr-2020 jdolecek

pass b_flags B_PHYS and B_RAW when setting up the buf for underlying device

should fix misfired KASSERT() in xbd(4)


Revision tags: phil-wifi-20200421 bouyer-xenpvh-base1
# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

branches: 1.274.4;
UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.
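
The log does not show which expression was widened, so the following is
only a generic sketch of the technique, with hypothetical function names:
when both operands are 32 bits wide, the multiplication is carried out in
32 bits even if the result is assigned to a 64-bit variable.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit multiply: the product wraps before it is widened. */
static uint64_t
bytes_narrow(uint32_t nblocks, uint32_t blksize)
{
	return nblocks * blksize;
}

/* Widen one operand first so the multiply is done in 64 bits. */
static uint64_t
bytes_wide(uint32_t nblocks, uint32_t blksize)
{
	uint64_t bytes = (uint64_t)nblocks * blksize;

	/* A 64-bit product of two 32-bit values cannot overflow. */
	assert(blksize == 0 || bytes / blksize == nblocks);
	return bytes;
}
```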


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.
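
The padding effect can be sketched with two hypothetical structures (not
the real vnd ioctl layout). BSD ioctl numbers encode the argument size, so
a size computed with offsetof() on the new structure only matches the old
structure when no padding precedes the new 64 bit member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old request layout: three 32 bit members. */
struct old_req_sketch {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
};

/* Hypothetical new layout: same members plus a trailing 64 bit one. */
struct new_req_sketch {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
	uint64_t size;
};

/* "Size of the old structure" computed the problematic way. */
static size_t
old_size_via_offsetof(void)
{
	return offsetof(struct new_req_sketch, size);
}

/* Bytes of alignment padding that the offsetof() approach counts. */
static size_t
padding_counted_by_offsetof(void)
{
	size_t off = old_size_via_offsetof();

	assert(off >= sizeof(struct old_req_sketch));
	return off - sizeof(struct old_req_sketch);
}
```

Where 64 bit integers are 64 bit aligned the padding is 4 bytes, so the
two sizes differ; where they are 32 bit aligned it is 0.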

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
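
The sparse-file test named above reduces to a single comparison. A minimal
sketch, using a hypothetical stand-in structure rather than the kernel's
struct vattr:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the two attributes compared. */
struct vattr_sketch {
	uint64_t va_size;	/* logical file size in bytes */
	uint64_t va_bytes;	/* bytes of disk space actually allocated */
};

/*
 * A file that occupies fewer bytes on disk than its logical size must
 * contain holes. The converse does not hold, hence "definitely".
 */
static int
definitely_sparse(const struct vattr_sketch *va)
{
	assert(va != NULL);
	return va->va_bytes < va->va_size;
}
```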


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffer this won't cause a deadlock, but it's bad for
performances on systems with e.g. multiple drives. Also, others stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
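
A minimal sketch of the kind of validation described above. The struct and function names are hypothetical stand-ins, not the driver's actual identifiers; the assumption is only that a zero field in the geometry leads to a later division by zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a user-supplied geometry; not the driver's struct. */
struct geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Reject any geometry with a zero field, since later code may divide
 * by these values or by their products.
 */
static bool
geom_is_valid(const struct geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return false;
	return true;
}
```

A caller would run this check before storing the geometry and return an error such as EINVAL when it fails.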


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.276 13-Apr-2020 maxv

constify


Revision tags: phil-wifi-20200411
# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
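
One way such a calculation could look, as a sketch only. The function name and the choice to round down are assumptions; the commit message does not say how a partial trailing cylinder is handled.

```c
#include <stdint.h>

/*
 * Derive a cylinder count from the image size when sector size,
 * sectors per track and tracks per cylinder are given.
 * Rounds down (an assumption) and returns 0 if any divisor is zero.
 * Assumes the product of the three geometry fields fits in 64 bits.
 */
static uint64_t
cylinders_from_size(uint64_t size_bytes, uint32_t secsize,
    uint32_t nsectors, uint32_t ntracks)
{
	uint64_t cylbytes = (uint64_t)secsize * nsectors * ntracks;

	if (cylbytes == 0)
		return 0;
	return size_bytes / cylbytes;
}
```

For a 64 MiB image with 512-byte sectors, 32 sectors per track and 64 tracks per cylinder, each cylinder holds 1 MiB, so the sketch yields 64 cylinders.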


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
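
The truncation hazard described above can be illustrated with a small sketch. The function body is a plausible shape for a helper defined on unsigned int, not a copy of the libkern source; min_via_uimin is a hypothetical wrapper that makes the narrowing explicit, and the example assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Defined on unsigned int, as the commit message describes. */
static unsigned int
uimin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: no truncation, but arguments may be evaluated twice. */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical caller that passes 64-bit values through uimin().
 * The casts spell out the narrowing that would otherwise be implicit.
 */
static uint64_t
min_via_uimin(uint64_t a, uint64_t b)
{
	return uimin((unsigned int)a, (unsigned int)b);
}
```

With a = 0x100000005 and b = 10, min_via_uimin() compares the low 32 bits (5) against 10 and returns 5, whereas MIN() on the full 64-bit values returns 10.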


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
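
The difference between clamping and taking the lower 32 bits, as a sketch with hypothetical function names:

```c
#include <stdint.h>

/* Clamp a 64-bit sector count into a 32-bit field instead of truncating. */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* What plain truncation would store: only the low 32 bits. */
static uint32_t
truncate_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

For a count of 0x100000001 sectors, truncation reports 1 sector, while clamping reports UINT32_MAX, which at least signals a very large disk.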


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.
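
Widening the right-hand side generally means casting one operand so that the arithmetic is carried out in the wider type before assignment. A generic sketch follows; the names are hypothetical and the driver's actual expression is not reproduced here.

```c
#include <stdint.h>

/* Narrow: the multiplication happens in 32 bits and wraps before widening. */
static uint64_t
bytes_narrow(uint32_t nblocks, uint32_t blksize)
{
	return nblocks * blksize;
}

/* Widened: cast one operand so the product is computed in 64 bits. */
static uint64_t
bytes_wide(uint32_t nblocks, uint32_t blksize)
{
	return (uint64_t)nblocks * blksize;
}
```

With nblocks = 0x00800001 and blksize = 512, the true product is 0x100000200; the narrow version wraps to 512, while the widened one keeps the full value.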


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
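
The padding problem described in 1.212 can be sketched with two hypothetical layouts. These are not the real `struct vnd_ioctl`; the member names are invented. The point is that using `offsetof()` of the new trailing 64-bit member as "the size of the old structure" also counts any alignment padding inserted before that member, and BSD ioctl command numbers encode the argument size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts; NOT the real struct vnd_ioctl. */
struct demo_old_req {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;		/* old 32-bit size, last member */
};

struct demo_new_req {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
	uint64_t size;		/* new 64-bit size appended at the end */
};

/* "Size of the old request" as derived from the new layout. */
size_t
old_size_by_offsetof(void)
{
	return offsetof(struct demo_new_req, size);
}

/* Bytes of alignment padding wrongly counted by the offsetof() approach. */
size_t
padding_counted(void)
{
	assert(old_size_by_offsetof() >= sizeof(struct demo_old_req));
	return old_size_by_offsetof() - sizeof(struct demo_old_req);
}
```

Where 64-bit integers are 8-byte aligned, `padding_counted()` is 4 for this layout, so a command number built from the `offsetof()` value would not match the one compiled into old binaries. Where they are only 4-byte aligned, it is 0 and the two sizes agree.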


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
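
The sparse-file test named in 1.205 (`va_bytes < va_size`) can be sketched as below. `struct demo_attr` is a hypothetical stand-in that carries only the two attribute fields mentioned in the log; it is not the kernel's `struct vattr`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the two attribute fields named in the log:
 * va_size (logical length) and va_bytes (bytes of storage allocated).
 */
struct demo_attr {
	uint64_t va_size;
	uint64_t va_bytes;
};

/* True only when fewer bytes are allocated than the file's length. */
bool
definitely_sparse(const struct demo_attr *va)
{
	assert(va != NULL);
	return va->va_bytes < va->va_size;
}
```

The test is one-sided: a true result means the file must contain holes, but a false result does not prove that it has none.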


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same.
The work that the routine's comments describe as its primary function,
end-of-partition adjustment, is wiped out, because at the end of
the routine b_cylinder is set, which overwrites the adjustment
to b_resid.
Break two: when doing the adjustment, a block count is created from a
byte count, a block calculation is done, and then the result is compared to
the original byte count, i.e. apples to oranges: not blocks to blocks,
but blocks to bytes.
Break three: all the other drivers that use this routine
would have broken as vnd compress did, so I must assume they always ignored
the results of this routine. So if end-of-partition adjustment is
really required, then all these other drivers have been broken for a
long time.
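
The unit mismatch described as "break two" can be avoided by converting everything to one unit before comparing. The sketch below is a hypothetical helper, not the routine the commit refers to. It clips a byte count at the end of a partition and does every comparison in bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many bytes of a request starting at block
 * blkno may be transferred without running past a partition of
 * nblocks blocks. All comparisons are done in bytes.
 */
uint64_t
clip_to_partition(uint64_t blkno, uint64_t bcount, uint64_t nblocks,
    uint32_t secsize)
{
	uint64_t avail;

	assert(secsize != 0);
	if (blkno >= nblocks)
		return 0;			/* starts at or past the end */
	avail = (nblocks - blkno) * secsize;	/* blocks -> bytes */
	return bcount < avail ? bcount : avail;
}
```

A 2048-byte request at block 98 of a 100-block partition with 512-byte sectors is clipped to 1024 bytes. Comparing the remaining block count (2) directly against the byte count (2048) would be the blocks-to-bytes mistake the log describes.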


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store an i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
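
A minimal sketch of the kind of validation 1.104 describes, assuming a geometry record with four fields. `struct demo_geom` and its field names are hypothetical and are not the driver's geometry structure. The idea is to reject a geometry with any zero field before later code uses those fields as divisors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct demo_geom {
	uint32_t secsize;
	uint32_t nsectors;
	uint32_t ntracks;
	uint32_t ncylinders;
};

/* Reject any geometry with a zero field before it is used as a divisor. */
bool
geom_valid(const struct demo_geom *g)
{
	assert(g != NULL);
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}

/* Cylinder containing blkno; callers must pass a validated geometry. */
uint64_t
cylinder_of(const struct demo_geom *g, uint64_t blkno)
{
	uint64_t spc;

	assert(geom_valid(g));
	/* 64-bit product of two nonzero 32-bit values is never zero. */
	spc = (uint64_t)g->nsectors * g->ntracks;
	return blkno / spc;
}
```

The product is computed in 64 bits because a 32-bit multiplication of two nonzero values can still wrap to zero, which would bring the divide-by-zero back.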


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Hack around the VOP_BMAP call to avoid recursive locks. The locking
interface does not currently allow recursive locks to be specified.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.275 10-Apr-2020 jdolecek

add support for DIOCGSTRATEGY and DIOCGCACHE

only allow DIOCCACHESYNC if open for writing, same as everything supporting
DIOCCACHESYNC


Revision tags: bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.274 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
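
The truncation hazard described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `uimin_sketch`, `MIN_SKETCH`, and `truncating_min` are hypothetical names, not the libkern definitions. The function form takes unsigned int arguments, so 64-bit values lose their upper bits before the comparison; the macro form compares at full width but evaluates each argument twice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in with an unsigned int signature, like uimin(). */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form, like MIN: no truncation, but arguments are evaluated twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/*
 * What a caller gets when 64-bit values go through the function form.
 * The casts make explicit the conversion that would otherwise happen
 * silently at the call site.
 */
static uint64_t
truncating_min(uint64_t a, uint64_t b)
{
	/* The demonstration assumes unsigned int is narrower than 64 bits. */
	assert(sizeof(unsigned int) < sizeof(uint64_t));
	return uimin_sketch((unsigned int)a, (unsigned int)b);
}
```

With a 32-bit unsigned int, `truncating_min(0x100000001, 2)` compares 1 against 2 and returns 1, while `MIN_SKETCH` on the same 64-bit operands returns 2.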


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
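
A minimal sketch of the difference between clamping and keeping the lower 32 bits, assuming a 64-bit sector count that must be reported through a 32-bit field. The function names are hypothetical and are not taken from vnd.c.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp: a count that does not fit is reported as the largest 32-bit value. */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	uint32_t out = nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;

	/* Postcondition: clamping never reports more sectors than exist. */
	assert((uint64_t)out <= nsectors);
	return out;
}

/* Lower 32 bits only: a large disk can appear to be a very small one. */
static uint32_t
low32_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

For a count of 0x100000010 sectors, the clamped value is UINT32_MAX, whereas the lower 32 bits alone would report just 0x10 sectors.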


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
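
The padding problem described in this commit can be sketched as follows. The structures and field names are hypothetical and do not reproduce the real vnd ioctl layout; `_Alignas(8)` is used only to force, on any platform, the 8-byte alignment that sparc-like ABIs apply to 64-bit integers. Because BSD ioctl numbers encode the size of the argument structure, a size derived from the offset of the new member differs from the true size of the old structure once padding is inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit fields, 12 bytes, no padding. */
struct old_ioctl_sketch {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
};

/*
 * Hypothetical new layout with a 64-bit member appended.  The forced
 * 8-byte alignment inserts 4 bytes of padding after osize.
 */
struct new_ioctl_sketch {
	uint32_t unit;
	uint32_t flags;
	uint32_t osize;
	_Alignas(8) uint64_t size;
};

static_assert(sizeof(struct old_ioctl_sketch) == 12, "old layout is 12 bytes");

/* Size of the old structure as guessed from the new member's offset. */
static size_t
old_size_via_offsetof(void)
{
	return offsetof(struct new_ioctl_sketch, size);
}

/* Size of the old structure as old binaries actually compiled it. */
static size_t
old_size_real(void)
{
	return sizeof(struct old_ioctl_sketch);
}
```

Here the offset-based guess is 16 while the real old size is 12, so an ioctl number built from the guess would not match the one used by old binaries.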


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
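
The "definitely sparse" test (va_bytes < va_size) can be sketched in userland terms as follows. This is an assumption-laden illustration, not the kernel code: the function name is hypothetical, and the block count is assumed to be in 512-byte units, as `st_blocks` from stat(2) conventionally is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A file is definitely sparse when fewer bytes are allocated on disk
 * than its logical size.  The converse does not hold: metadata blocks
 * can push the allocated byte count above the size of a file that
 * still contains holes.
 */
static bool
definitely_sparse(uint64_t blocks_512, uint64_t file_size)
{
	/* Guard the multiplication below against overflow. */
	assert(blocks_512 <= UINT64_MAX / 512);
	return blocks_512 * 512 < file_size;
}
```

For example, a 1001 MiB file with only 1 MiB (2048 blocks) allocated is reported as sparse, while a fully allocated 4096-byte file is not.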


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.
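
The convention described above can be sketched as follows. The structure is
a hypothetical two-field stand-in for struct buf, and the helper name is an
assumption made for illustration:

```c
#include <errno.h>

/* Hypothetical stand-in for the two struct buf fields discussed above. */
struct sk_buf {
	unsigned b_flags;	/* flag word; the driver leaves it alone */
	int	 b_error;	/* 0 on success, otherwise an errno value */
};

/*
 * Report a failed request by storing the error code only.  The flag
 * word is not modified, so its locking rules do not matter here.
 */
static void
sk_fail(struct sk_buf *bp, int error)
{
	bp->b_error = error;
}
```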


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
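
"Break two" above is a unit mismatch. A minimal sketch of an end of
partition adjustment that compares sectors with sectors and converts back to
bytes only at the end; the helper name and the 512-byte sector size are
assumptions, and this is not the routine the message refers to:

```c
#include <stdint.h>

#define SK_SECTOR_SIZE 512u	/* assumed sector size for this sketch */

/*
 * Hypothetical helper: clamp a transfer so that it does not run past
 * the end of a partition.  All comparisons are done in sectors.
 * Returns the number of bytes that may be transferred, 0 if the start
 * is at or past the end.
 */
static uint32_t
sk_clamp_to_partition(uint64_t start_sector, uint32_t byte_count,
    uint64_t partition_sectors)
{
	/* Sectors touched by the request, rounded up. */
	uint64_t want = ((uint64_t)byte_count + SK_SECTOR_SIZE - 1) /
	    SK_SECTOR_SIZE;

	if (start_sector >= partition_sectors)
		return 0;
	if (start_sector + want > partition_sectors)
		return (uint32_t)((partition_sectors - start_sector) *
		    SK_SECTOR_SIZE);
	return byte_count;
}
```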


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
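
A minimal sketch of this kind of validation. The structure and function
names are hypothetical, and the check against the backing file size is an
assumption added for illustration; only the zero checks follow directly from
the message above:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical geometry record, loosely modeled on a disk geometry. */
struct sk_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Reject any geometry with a zero field, since later code divides by
 * these values.  Then check, using division only so that nothing can
 * overflow, that the geometry fits in the backing store.
 */
static int
sk_geom_validate(const struct sk_geom *g, uint64_t backing_bytes)
{
	uint64_t cylinders_that_fit;

	if (g->secsize == 0 || g->nsectors == 0 || g->ntracks == 0 ||
	    g->ncylinders == 0)
		return EINVAL;

	cylinders_that_fit =
	    backing_bytes / g->secsize / g->nsectors / g->ntracks;
	if (g->ncylinders > cylinders_that_fit)
		return EINVAL;
	return 0;
}
```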


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a NetBSD disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
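
A small sketch of the kind of loss described above. Both helpers are
hypothetical; the second only illustrates rounding down to whole megabytes
and is not claimed to be the old code:

```c
#include <stdint.h>

#define SK_SECTOR_SIZE	512u			/* assumed sector size */
#define SK_MEGABYTE	(1024u * 1024u)

/* Size in sectors taken directly from the byte size of the image. */
static uint64_t
sk_size_in_sectors(uint64_t bytes)
{
	return bytes / SK_SECTOR_SIZE;
}

/* Same, but after rounding the image down to whole megabytes first. */
static uint64_t
sk_size_in_sectors_mb_truncated(uint64_t bytes)
{
	return (bytes / SK_MEGABYTE) * (SK_MEGABYTE / SK_SECTOR_SIZE);
}
```

For a 1.5 MB image the first helper keeps all 3072 sectors, while the
second keeps only 2048 and makes the last half megabyte unreachable.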


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.
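
Why the wider type matters can be sketched as follows. The helper name is
hypothetical, and int64_t stands in for off_t so that the example does not
depend on the platform's off_t width:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a block number to a byte offset.  The
 * block number is widened before the multiplication, so the product
 * is computed in 64 bits and cannot wrap at 2 GB or 4 GB.
 */
static int64_t
sk_block_to_byte_offset(int32_t blkno, int32_t block_size)
{
	return (int64_t)blkno * block_size;
}
```

With 512-byte blocks, block 8388608 lies exactly at byte 4294967296, which
does not fit in a 32-bit int.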


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.
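
The masking can be sketched as below. The flag values and names are
hypothetical placeholders, not the kernel's definitions, and the sketch
leaves out any flags the driver sets on the component buffer itself:

```c
/* Hypothetical flag bits for this sketch only. */
#define SK_B_READ	0x1u
#define SK_B_ASYNC	0x2u
#define SK_B_OTHER	0x4u	/* any flag that must not be inherited */

/*
 * Flags for a component buffer: only the read and async bits of the
 * original request are carried over.
 */
static unsigned
sk_component_flags(unsigned parent_flags)
{
	return parent_flags & (SK_B_READ | SK_B_ASYNC);
}
```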


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface currently doesn't allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* are the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store?


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS IDs, take two. They're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


# 1.274 23-Feb-2020 ad

UVM locking changes, proposed on tech-kern:

- Change the lock on uvm_object, vm_amap and vm_anon to be a RW lock.
- Break v_interlock and vmobjlock apart. v_interlock remains a mutex.
- Do partial PV list locking in the x86 pmap. Others to follow later.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
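
As a rough illustration of the calculation described above, the sketch below derives a cylinder count from the image size when the geometry leaves it unspecified. The structure and function names are hypothetical and are not taken from vnd.c; the rule that a zero count means "missing" is likewise an assumption.

```c
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* 0 is taken to mean "not specified" */
};

/*
 * Fill in a missing cylinder count from the image size.
 * Returns 0 on success, -1 if the rest of the geometry is unusable.
 */
static int
geom_fill_cylinders(struct geom_sketch *g, uint64_t image_bytes)
{
	uint64_t trkbytes, cylbytes, ncyl;

	if (g->ncylinders != 0)
		return 0;		/* already specified, nothing to do */

	/* Two 32-bit factors always fit in 64 bits. */
	trkbytes = (uint64_t)g->secsize * g->nsectors;
	if (trkbytes == 0 || g->ntracks == 0)
		return -1;
	/* The third factor may not fit, so check before multiplying. */
	if (g->ntracks > UINT64_MAX / trkbytes)
		return -1;
	cylbytes = trkbytes * g->ntracks;

	ncyl = image_bytes / cylbytes;	/* a partial last cylinder is dropped */
	if (ncyl > UINT32_MAX)
		return -1;
	g->ncylinders = (uint32_t)ncyl;
	return 0;
}
```

With 512-byte sectors, 32 sectors per track and 64 tracks per cylinder, each cylinder holds 1 MiB, so a 100 MiB image yields 100 cylinders.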


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
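
The truncation hazard described above can be reproduced in a few lines. This is only a sketch: uimin_sketch and MIN_SKETCH are stand-ins written for this example, not the libkern definitions, and it assumes unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Stand-in for a min() that is defined on unsigned int only. */
static inline unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Type-preserving macro; note that it evaluates its arguments twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

static uint64_t
clamp_with_function(uint64_t len, uint64_t limit)
{
	/* Both arguments are silently converted to unsigned int here. */
	return uimin_sketch(len, limit);
}

static uint64_t
clamp_with_macro(uint64_t len, uint64_t limit)
{
	/* The comparison is done in 64 bits; nothing is truncated. */
	return MIN_SKETCH(len, limit);
}
```

Passing 0x100000001 and 0x200000000 to clamp_with_function compares the truncated values 1 and 0 and returns 0, whereas clamp_with_macro returns 0x100000001. Naming the function after its type makes that conversion visible at the call site.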


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.
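
For readers unfamiliar with the idiom, the sketch below contrasts positional and C99 designated initializers. The structure is a made-up stand-in with hypothetical members; it is not the real struct dkdriver.

```c
#include <stddef.h>

/* Hypothetical operations table, not the real struct dkdriver. */
struct driver_ops_sketch {
	int	(*open)(int unit);
	int	(*close)(int unit);
	void	(*strategy)(void *bp);
	void	(*minphys)(void *bp);
};

static int
sketch_open(int unit)
{
	return unit >= 0 ? 0 : -1;
}

static void
sketch_strategy(void *bp)
{
	(void)bp;	/* nothing to do in this sketch */
}

/* Positional form: the meaning of each slot depends on member order. */
static const struct driver_ops_sketch ops_positional = {
	sketch_open, NULL, sketch_strategy, NULL
};

/* C99 form: members are named, and omitted ones are zero-initialized. */
static const struct driver_ops_sketch ops_designated = {
	.open = sketch_open,
	.strategy = sketch_strategy,
};
```

Both tables end up with the same contents, but only the designated form keeps working unchanged if a member is later inserted in the middle of the structure.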


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32 bits of the 64-bit number.
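
The difference between the two behaviours can be sketched as follows. The function names are invented for this example and do not appear in vnd.c.

```c
#include <stdint.h>

/* Old behaviour as described: keep only the low 32 bits. */
static uint32_t
sectors_truncated(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour as described: saturate at UINT32_MAX. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a count of 0x100000010 sectors the truncating version reports 0x10, a tiny disk, while the clamping version reports the largest value the 32-bit field can hold.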


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
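
One general way a size limit of this kind can arise is arithmetic carried out in a 32-bit type before the result is stored in a wider one. The sketch below is an assumption-laden illustration of that pattern, not the code that was changed: the function names are hypothetical, and uint32_t stands in for a 32-bit size_t.

```c
#include <stdint.h>

/*
 * Narrow form: both operands are 32 bits wide, so (assuming unsigned
 * int is 32 bits) the product is computed modulo 2^32 and only then
 * widened to the return type.
 */
static uint64_t
image_bytes_narrow(uint32_t nsectors, uint32_t secsize)
{
	return nsectors * secsize;
}

/* Wide form: widen one operand first so the multiply is done in 64 bits. */
static uint64_t
image_bytes_wide(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)nsectors * secsize;
}
```

With 0x800001 sectors of 512 bytes, the narrow form yields 0x200 while the wide form yields the correct 0x100000200.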


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
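
The class of bug being fixed can be sketched in userland with snprintf(). The helper name is hypothetical; the point is only that a name is passed as an argument to a constant format, never as the format itself.

```c
#include <stdio.h>

/*
 * Copy a device name into buf.
 *
 * The unsafe variant would be snprintf(buf, len, name): any '%' in the
 * name would then be interpreted as a conversion specification and
 * read arguments that were never passed.
 */
static int
format_device_name(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}
```

A name such as "vnd%d%s" is copied through verbatim by this helper, since the percent signs are treated as ordinary data.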


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
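
The padding problem described above can be sketched with two made-up structures. They are hypothetical and much smaller than the real vnd ioctl structures, and the example assumes that three uint32_t members occupy exactly 12 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Old layout: ends with the 32-bit size member. */
struct vnd_old_sketch {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
};

/* New layout: the same members plus a 64-bit size appended at the end. */
struct vnd_new_sketch {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
	uint64_t size;
};

/*
 * Number of bytes by which "offset of the new member" overstates the
 * size of the old structure. It is nonzero wherever the compiler pads
 * before the 64-bit member to satisfy its alignment.
 */
static size_t
padding_before_size(void)
{
	return offsetof(struct vnd_new_sketch, size) -
	    sizeof(struct vnd_old_sketch);
}
```

Where 64-bit integers need only 4-byte alignment the function returns 0, and where they need 8-byte alignment it returns 4. Because BSD-style ioctl numbers encode the argument size, a size that includes such padding produces a different ioctl number from the one old binaries were compiled with.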


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
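
The va_bytes < va_size test has a close userland analogue, sketched below with stat(2). The function names are hypothetical, and treating st_blocks as a count of 512-byte units is an assumption that holds on the BSDs and many other systems but is not guaranteed everywhere.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * A file that occupies fewer bytes on disk than its length must have
 * holes. The converse does not hold, so this only catches files that
 * are "definitely" sparse.
 */
static bool
definitely_sparse(uint64_t bytes_allocated, uint64_t file_size)
{
	return bytes_allocated < file_size;
}

/* Userland analogue; returns 0 on success, -1 if stat(2) fails. */
static int
path_definitely_sparse(const char *path, bool *sparse)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return -1;
	/* Assumption: st_blocks counts 512-byte units. */
	*sparse = definitely_sparse((uint64_t)st.st_blocks * 512,
	    (uint64_t)st.st_size);
	return 0;
}
```

An image created by seeking far past the start and writing a single megabyte has a large length but few allocated blocks, so it is flagged by this test.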


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: b_cylinder and b_resid are one and the same.
The work that the routine's comment describes as its primary function,
end of partition adjustment, is wiped out: at the end of
the routine b_cylinder is set, splat, wiping out the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.
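
A minimal sketch of why a zero block size has to be rejected up front: any later offset-to-block computation divides by it. The header structure and function names below are hypothetical and simplified; the real compressed-image header layout is not shown in this log.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, simplified image header (illustrative field names). */
struct img_header {
	uint32_t block_size;	/* bytes per uncompressed block */
	uint32_t num_blocks;	/* number of blocks in the image */
};

/* Reject headers that would cause a division by zero later on. */
static int
check_img_header(const struct img_header *h)
{
	if (h->block_size == 0)
		return EINVAL;
	return 0;
}

/*
 * Index of the block containing byte offset `off`.
 * Only safe to call after check_img_header() has returned 0.
 */
static uint64_t
block_index(const struct img_header *h, uint64_t off)
{
	return off / h->block_size;
}
```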


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
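
As a rough sketch of this kind of validation: any geometry field that is later used as a divisor must be non-zero before the geometry is accepted. The structure and names below are hypothetical and do not reproduce the driver's actual geometry structure or the checks this revision added.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
};

/* Reject any geometry whose fields would later be used as a zero divisor. */
static int
geom_validate(const struct geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 || g->ntracks == 0)
		return EINVAL;
	return 0;
}

/* Sectors per cylinder, widened to 64 bits so the product cannot wrap. */
static uint64_t
geom_secpercyl(const struct geom *g)
{
	return (uint64_t)g->nsectors * g->ntracks;
}
```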


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface currently doesn't allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


Revision tags: ad-namecache-base1
# 1.273 17-Jan-2020 ad

Acquire kernel_lock in the bp->b_iodone callback.


Revision tags: ad-namecache-base netbsd-9-0-RC1 phil-wifi-20191119 netbsd-9-base phil-wifi-20190609 isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

branches: 1.272.6;
Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
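
The truncation hazard described above can be sketched in userland C. This is an illustration only, not the libkern source: uimin_sketch and MIN_SKETCH are hypothetical stand-ins for the renamed function and for a type-preserving macro in the style of MIN, and the result noted in the comments assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Hypothetical stand-in for a min helper defined on unsigned int only. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Hypothetical type-preserving macro; note it evaluates each argument twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/*
 * Passing 64-bit values to the unsigned int helper: both arguments are
 * implicitly converted, so the upper 32 bits are silently dropped before
 * the comparison happens (assuming unsigned int is 32 bits wide).
 */
static uint64_t
min_truncating(uint64_t a, uint64_t b)
{
	return uimin_sketch(a, b);
}

/* The macro compares the operands at their full 64-bit width. */
static uint64_t
min_preserving(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

With a = 0x100000001 and b = 5, min_truncating() compares 1 against 5 and returns 1, while min_preserving() returns 5. The rename makes the narrow type visible at every call site.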


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2; 1.263.4;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
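
A minimal sketch of the difference between the two behaviours, written as standalone C. The function names are hypothetical and the field that receives the value in vnd.c is not shown in this log.

```c
#include <stdint.h>

/* Old behaviour (illustrative): keep only the lower 32 bits. */
static uint32_t
sectors_low32(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour (illustrative): saturate at UINT32_MAX. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

For a count of 0x100000010 sectors the first function reports 0x10, which looks like a tiny disk, while the second reports UINT32_MAX, which at least signals "as large as this field can express".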


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
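
The kind of fix described in 1.223 and 1.224 can be illustrated with a generic size computation. This is a sketch under assumptions: the actual expressions changed in vnd.c are not quoted in this log, and the function and parameter names below are hypothetical.

```c
#include <stdint.h>

/*
 * The multiplication is carried out in 32 bits and wraps before the
 * result is widened by the return conversion.
 */
static uint64_t
size_narrow(uint32_t nblocks, uint32_t blksize)
{
	return nblocks * blksize;
}

/*
 * Widening one operand on the right-hand side first makes the whole
 * multiplication happen in 64 bits.
 */
static uint64_t
size_wide(uint32_t nblocks, uint32_t blksize)
{
	return (uint64_t)nblocks * blksize;
}
```

With 0x800000 blocks of 512 bytes the true size is 2^32 bytes; size_narrow() wraps to 0, while size_wide() returns 4294967296.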


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
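
The padding problem described in 1.212 can be reproduced with two small structures. These are hypothetical layouts chosen for illustration, not the real struct vnd_ioctl; the point is only that offsetof() of an appended 64-bit member may include alignment padding that the old structure never had.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit fields, 12 bytes, no padding. */
struct old_req {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
};

/* Hypothetical new layout: the same fields plus a 64-bit size at the end. */
struct new_req {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
	uint64_t size;
};

/*
 * Number of bytes by which offsetof(struct new_req, size) overstates the
 * size of the old structure. This is 0 where uint64_t needs only 4-byte
 * alignment (e.g. i386) and 4 where it needs 8-byte alignment.
 */
static size_t
old_size_padding(void)
{
	return offsetof(struct new_req, size) - sizeof(struct old_req);
}
```

Because the ioctl number encodes the argument size, a nonzero result here means a number derived from the offset would not match the number an old binary was compiled with.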


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
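
The va_bytes < va_size test has a straightforward userland analogue using stat(2). The sketch below is not the kernel code: the function names are hypothetical, and it assumes st_blocks is counted in 512-byte units, which is the traditional convention but worth checking on a given system.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * A file that occupies fewer bytes on disk than its logical size must
 * contain holes. The converse does not hold, so a false result only
 * means "not definitely sparse".
 */
static bool
definitely_sparse(uint64_t bytes_allocated, uint64_t logical_size)
{
	return bytes_allocated < logical_size;
}

/*
 * Apply the check to a path. Returns 1 if definitely sparse, 0 if not,
 * and -1 if stat() fails. Assumes 512-byte st_blocks units.
 */
static int
file_definitely_sparse(const char *path)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return -1;
	return definitely_sparse((uint64_t)st.st_blocks * 512,
	    (uint64_t)st.st_size) ? 1 : 0;
}
```

The check is deliberately one-sided, matching the "definitely sparse" wording: a file with indirect blocks or preallocation can have holes and still occupy at least as many bytes as its logical size.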


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work that the routine's comments describe as its primary function,
end of partition adjustment, is wiped out, because at the end of
the routine b_cylinder is set, splat, wiping out the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
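
A minimal sketch of the kind of check described here, not the driver's actual code. The struct, its field names, and both functions are hypothetical; the only assumption taken from the message is that later code divides by values derived from the geometry.

```c
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/* Return 1 if no field is zero, so later divisions are safe. */
static int
geom_is_sane(const struct geom_sketch *g)
{
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}

/*
 * Map an absolute sector number to its cylinder.
 * Returns -1 instead of dividing by zero on a bad geometry.
 * The product is computed in 64 bits so it cannot wrap to zero.
 */
static int64_t
cylinder_of(const struct geom_sketch *g, uint64_t sector)
{
	uint64_t spc;

	if (!geom_is_sane(g))
		return -1;
	spc = (uint64_t)g->nsectors * g->ntracks;
	return (int64_t)(sector / spc);
}
```

Rejecting the geometry at configuration time keeps the check in one place, so the code that later divides by these values does not need its own guards.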


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as a constant structure in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.
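
To illustrate why the wider type matters, here is a small sketch. The function names and the 512-byte block size are assumptions made for the example, not taken from the driver.

```c
#include <limits.h>
#include <stdint.h>

/* Assumed block size for this sketch only. */
#define SKETCH_BSIZE 512

/* Convert a block number to a byte offset using 64-bit arithmetic. */
static int64_t
block_to_byte_offset(int64_t bn)
{
	return bn * SKETCH_BSIZE;
}

/* Return 1 if the value could be stored in a plain int without loss. */
static int
fits_in_int(int64_t v)
{
	return v >= INT_MIN && v <= INT_MAX;
}
```

With 512-byte blocks, block number 4194304 already corresponds to a byte offset of 2 GiB, which no longer fits in a 32-bit int.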


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


Revision tags: isaki-audio2-base
# 1.272 01-Mar-2019 pgoyette

Rename the MODULE_*_HOOK() macros to MODULE_HOOK_*() as briefly
discussed on irc.

NFCI intended.

Ride the earlier kernel bump - it's getting crowded.


# 1.271 27-Jan-2019 pgoyette

Merge the [pgoyette-compat] branch


Revision tags: pgoyette-compat-20190127 pgoyette-compat-20190118 pgoyette-compat-1226
# 1.270 10-Dec-2018 hannken

Operation handle_with_strategy() also needs the
fstrans_start_lazy() / fstrans_done() bracket.

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-1126 pgoyette-compat-1020
# 1.269 07-Oct-2018 mlelstv

Use IO_DIRECT for file I/O to reduce buffer cache contention.

Restore old behaviour to flush pages only when usage exceeds 1MB.

No longer use PGO_SYNCIO, regular writes to the device do not require
the data to reach stable storage, the DIOCCACHESYNC ioctl is used
for that.


# 1.268 07-Oct-2018 mlelstv

Calculate a missing cylinder count in the geometry spec from image size.
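
One plausible way to derive the count, sketched under assumptions: the function name and parameters are hypothetical, the image size is taken in bytes, and any partial trailing cylinder is dropped. The driver's real computation may differ.

```c
#include <stdint.h>

/*
 * Derive a cylinder count from the image size and the rest of the
 * geometry. Returns 0 if any geometry field is zero. Assumes the
 * bytes-per-cylinder product fits in 64 bits.
 */
static uint64_t
cylinders_from_size(uint64_t image_bytes, uint32_t secsize,
    uint32_t nsectors, uint32_t ntracks)
{
	uint64_t bytes_per_cyl;

	if (secsize == 0 || nsectors == 0 || ntracks == 0)
		return 0;
	bytes_per_cyl = (uint64_t)secsize * nsectors * ntracks;
	if (bytes_per_cyl == 0)
		return 0;
	/* Integer division rounds down, dropping a partial cylinder. */
	return image_bytes / bytes_per_cyl;
}
```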


# 1.267 07-Oct-2018 mlelstv

Add flag to enforce file I/O even when bmap/strategy would be possible.
This makes it easier to compare both modes, it also allows coherent
operation between vnd device and image file.


# 1.266 05-Oct-2018 hannken

Bring back three state file system suspension:

NORMAL -> SUSPENDING -> SUSPENDED

and add operation fstrans_start_lazy() that only blocks while SUSPENDED.

Change vndthread() support operation handle_with_rdwr() to bracket
its file system operations by fstrans_start_lazy() and fstrans_done().

PR kern/53624 (dom0 freeze on domU exit)


Revision tags: pgoyette-compat-0930
# 1.265 20-Sep-2018 mlelstv

getdisksize only operates on device vnodes. Use the ioctl on the underlying
device instead.


Revision tags: pgoyette-compat-0906
# 1.264 03-Sep-2018 riastradh

Rename min/max -> uimin/uimax for better honesty.

These functions are defined on unsigned int. The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER! Some subsystems have

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX. Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate. But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all. (Who knows, maybe in some cases integer
truncation is actually intended!)
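
The difference between the two forms can be sketched as follows. The names `uimin_sketch` and `MIN_SKETCH` are hypothetical stand-ins, and the example assumes a platform where unsigned int is 32 bits wide.

```c
#include <stdint.h>

/* Function form: defined on unsigned int, like the renamed helper. */
static unsigned int
uimin_sketch(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Macro form: keeps the operand types, but may evaluate an argument twice. */
#define MIN_SKETCH(a, b) ((a) < (b) ? (a) : (b))

/*
 * Passing 64-bit values to the function narrows them first. The casts
 * are written out here; an implicit conversion at the call site would
 * narrow the values the same way.
 */
static uint64_t
min_via_function(uint64_t a, uint64_t b)
{
	return uimin_sketch((unsigned int)a, (unsigned int)b);
}

/* The macro compares the full 64-bit values. */
static uint64_t
min_via_macro(uint64_t a, uint64_t b)
{
	return MIN_SKETCH(a, b);
}
```

For the inputs 0x100000001 and 5, the function form sees 1 and 5 after narrowing and so returns 1, while the macro form returns 5.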


Revision tags: jdolecek-ncqfixes-base pgoyette-compat-0728 phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422 pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base tls-maxphys-base-20171202
# 1.263 28-Oct-2017 riastradh

branches: 1.263.2;
Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
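
The two behaviours can be contrasted in a short sketch; both function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Old behaviour: keep only the lower 32 bits of the sector count. */
static uint32_t
truncate_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* New behaviour: saturate at UINT32_MAX when the count does not fit. */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

Truncation can report a very large disk as a tiny one, whereas clamping at least reports the largest size the 32-bit field can express.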


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.
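
The class of bug behind revisions 1.223 and 1.224 can be sketched as follows, assuming a platform where int is 32 bits wide. The function names are hypothetical and the code is not taken from vnd.c. It shows why a wide left-hand side is not enough: the multiplication happens in the type of the operands, so one operand on the right-hand side has to be widened first.

```c
#include <stdint.h>

/*
 * Both operands are 32-bit, so the product is computed in 32 bits and
 * wraps before it is converted to the 64-bit return type.
 */
static uint64_t
bytes_narrow(uint32_t nsectors, uint32_t secsize)
{
	return nsectors * secsize;
}

/*
 * Widening one operand first forces a 64-bit multiplication, so the
 * full product is preserved.
 */
static uint64_t
bytes_wide(uint32_t nsectors, uint32_t secsize)
{
	return (uint64_t)nsectors * secsize;
}
```

With 8388608 sectors of 512 bytes (4 GiB), the narrow version wraps to 0 while the wide version yields 4294967296.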


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.
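
The padding problem described above can be illustrated with two hypothetical structures. These are not the real vnd ioctl structures; the member names are invented for the sketch. Taking offsetof() of the new trailing 64-bit member gives the size of the old structure plus any alignment padding, which need not equal sizeof() of the old structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit members, 12 bytes, no padding. */
struct demo_old {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
};

/* Hypothetical new layout: the old members plus a trailing 64-bit size. */
struct demo_new {
	uint32_t flags;
	uint32_t unit;
	uint32_t osize;
	uint64_t size;
};

/* "Old size" derived from the new structure; may include padding. */
static size_t
old_size_via_offsetof(void)
{
	return offsetof(struct demo_new, size);
}

/* Actual size of the old structure. */
static size_t
old_size_real(void)
{
	return sizeof(struct demo_old);
}
```

Where uint64_t must be 8-byte aligned, the first function returns 16 and the second 12. Because the ioctl number encodes the argument size, a number built from the first value would not match the one old binaries use.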

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
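
A minimal sketch of the "definitely sparse" test named above, using a hypothetical structure that mirrors only the two attributes involved. The struct and function names are assumptions for illustration; the kernel reads these values from struct vattr.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two vattr fields the check uses. */
struct demo_vattr {
	uint64_t va_size;	/* logical file length in bytes */
	uint64_t va_bytes;	/* bytes of disk space actually allocated */
};

/*
 * A file whose allocated space is smaller than its length must contain
 * holes.  The converse does not hold: a file can pass this test and
 * still be sparse, which is why the check only catches the definite case.
 */
static bool
definitely_sparse(const struct demo_vattr *va)
{
	return va->va_bytes < va->va_size;
}
```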


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
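
"Break two" above is a unit mismatch. A sketch of an end-of-partition adjustment that compares blocks to blocks and only converts back to bytes for the result might look like this. The function and parameter names are hypothetical and this is not the routine the commit refers to.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given a starting block, a byte count, the
 * partition size in blocks and the sector size, return how many bytes
 * of the request fit inside the partition.
 */
static uint64_t
trim_to_partition(uint64_t blkno, uint64_t bcount, uint64_t part_blocks,
    uint32_t secsize)
{
	uint64_t nblocks, avail;

	/* Nothing fits if the request starts at or past the end. */
	if (secsize == 0 || blkno >= part_blocks)
		return 0;

	/* Convert the byte count to blocks, rounding up. */
	nblocks = (bcount + secsize - 1) / secsize;

	/* Blocks remaining in the partition from blkno onwards. */
	avail = part_blocks - blkno;

	/* Compare blocks with blocks, never blocks with bytes. */
	if (nblocks <= avail)
		return bcount;

	/* Truncate and convert the result back to bytes. */
	return avail * secsize;
}
```

The sketch assumes bcount is small enough that the round-up addition cannot overflow.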


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fall back to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed more deeply. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate a buffer this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may have this issue as well.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
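
A sketch of the kind of validation this revision describes, using a hypothetical geometry structure rather than the driver's own. Every field that later code might divide by must be non-zero before the geometry is accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-supplied geometry; the field names are illustrative. */
struct demo_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Accept a geometry only if no field is zero, so that values derived
 * from it, such as sectors per cylinder, can be used as divisors.
 */
static bool
geom_usable(const struct demo_geom *g)
{
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}

/* Sectors per cylinder; only meaningful when geom_usable() is true. */
static uint64_t
geom_secpercyl(const struct demo_geom *g)
{
	return (uint64_t)g->nsectors * g->ntracks;
}
```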


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as a constant structure in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Hack around the VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow recursive locks to be specified.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.263 28-Oct-2017 riastradh

Kill some more extern cfdriver xyz_cd in favour of #include "ioconf.h".


Revision tags: nick-nhusb-base-20170825
# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4; 1.259.6;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
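
A minimal sketch of why clamping is the safer choice, for illustration only.
The helper names clamp_sectors() and truncate_sectors() are hypothetical and
are not taken from vnd.c; they only contrast the two behaviours the message
describes.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from vnd.c): report a 64-bit sector count in
 * a 32-bit field by saturating at UINT32_MAX.
 */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/*
 * Hypothetical helper: what keeping only the lower 32 bits yields.
 * A disk slightly larger than 2^32 sectors would appear tiny.
 */
static uint32_t
truncate_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

With 2^32 + 1 sectors, the truncating version reports a single sector, while
the clamping version reports the largest value the field can hold.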


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advise to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
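
The padding problem described above can be sketched as follows. This is an
illustration under stated assumptions, not the driver's code: the structures
vnd_old and vnd_new and their members are simplified, hypothetical stand-ins
for the real ioctl argument, and padding_before_size() is an invented helper.
BSD ioctl numbers encode the size of the argument structure, so any mismatch
between the computed and the historical size yields a different ioctl number.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of the argument before the 64-bit size was added. */
struct vnd_old {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;		/* old 32-bit size, last member */
};

/* Hypothetical layout after appending the 64-bit size. */
struct vnd_new {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
	uint64_t size;		/* may need 8-byte alignment */
};

/*
 * Bytes by which offsetof(struct vnd_new, size) overstates the size of
 * the old structure.  Zero where uint64_t needs only 4-byte alignment,
 * four where it needs 8-byte alignment.
 */
static size_t
padding_before_size(void)
{
	return offsetof(struct vnd_new, size) - sizeof(struct vnd_old);
}
```

Where the result is nonzero, using the offset of the new member as the size
of the old structure produces an ioctl number that old binaries never issue.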


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
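
A minimal sketch of the sparse-file test named above, assuming only the two
attribute fields the message mentions. struct file_attr and
definitely_sparse() are hypothetical stand-ins, not the kernel's struct vattr
or any function in vnd.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two attribute fields being compared. */
struct file_attr {
	uint64_t va_size;	/* logical file length in bytes */
	uint64_t va_bytes;	/* bytes of disk space allocated */
};

/*
 * A file that occupies fewer bytes on disk than its logical length must
 * contain holes.  The converse does not hold: metadata blocks can be
 * counted in va_bytes, so a file passing this test may still be sparse.
 */
static bool
definitely_sparse(const struct file_attr *va)
{
	assert(va != NULL);
	return va->va_bytes < va->va_size;
}
```

The one-sided nature of the test is why the message says "definitely sparse"
rather than claiming to detect every sparse file.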


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work that the routine's comments describe as its primary function,
end of partition adjustment, is wiped out: at the end of the routine
b_cylinder is set, overwriting the adjustment to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since all the other drivers that used this routine would
otherwise have broken as vnd compress did, I must assume they always
ignored the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
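
The "apples to oranges" problem in break two can be avoided by converting
units exactly once and comparing like with like. The following is a
minimal userland sketch, not the routine that was fixed;
clamp_to_partition() and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp a transfer so that it does not run past the end of a partition.
 * The request length arrives in bytes, so it is converted to sectors
 * before it is compared with the sectors left in the partition.
 * Returns the number of bytes that may be transferred (0 at or past
 * the end of the partition).
 */
static uint64_t
clamp_to_partition(uint64_t start_sec, uint64_t nbytes,
    uint64_t part_secs, uint32_t secsize)
{
	uint64_t nsecs, avail;

	assert(secsize != 0);
	if (start_sec >= part_secs)
		return 0;
	nsecs = (nbytes + secsize - 1) / secsize;  /* bytes -> sectors, rounded up */
	avail = part_secs - start_sec;             /* sectors left in the partition */
	if (nsecs > avail)                         /* sectors compared with sectors */
		return avail * secsize;            /* back to bytes for the caller */
	return nbytes;
}
```

For a 100-sector partition with 512-byte sectors, a 1024-byte request
starting at sector 99 is cut to 512 bytes, while the same request at
sector 0 passes through unchanged.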


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was commited. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.
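
A header check of this kind might look like the sketch below. The struct
layout and function names are hypothetical and do not reflect the actual
compressed image header; the point is only that the block size is
rejected before anything divides by it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of a compressed image header. */
struct comp_header {
	uint32_t block_size;	/* uncompressed bytes per block */
	uint32_t num_blocks;	/* number of compressed blocks */
};

/* Reject headers that would later cause a division by zero. */
static int
check_comp_header(const struct comp_header *h)
{
	if (h->block_size == 0)
		return EINVAL;
	return 0;
}

/* Map a byte offset to a block index; only valid after the check above. */
static uint64_t
comp_block_index(const struct comp_header *h, uint64_t offset)
{
	assert(h->block_size != 0);
	return offset / h->block_size;
}
```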


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
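
As an illustration of the kind of validation meant here, the sketch below
rejects any geometry with a zero field before a value derived from it is
used as a divisor. The struct and function names are hypothetical and are
not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-supplied disk geometry. */
struct disk_geom {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/* Every field ends up in a product or divisor somewhere, so none may be 0. */
static int
check_geometry(const struct disk_geom *g)
{
	if (g->secsize == 0 || g->nsectors == 0 ||
	    g->ntracks == 0 || g->ncylinders == 0)
		return EINVAL;
	return 0;
}

/* Typical later use: sectors per cylinder becomes a divisor. */
static uint64_t
cylinder_of(const struct disk_geom *g, uint64_t sector)
{
	uint64_t secpercyl = (uint64_t)g->nsectors * g->ntracks;

	assert(secpercyl != 0);	/* guaranteed once check_geometry() passed */
	return sector / secpercyl;
}
```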


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.
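
The reason for the wider type can be shown with a small sketch. It is not
the driver's code; block_to_byte_offset() is a hypothetical helper and
int64_t stands in for off_t. The block number is widened before the
multiplication so the byte offset does not overflow a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a block number to a byte offset. The cast widens the block
 * number to 64 bits before multiplying; doing the multiplication in a
 * 32-bit int would overflow for offsets of 2 GB and beyond.
 */
static int64_t
block_to_byte_offset(int32_t blkno, int32_t secsize)
{
	assert(blkno >= 0 && secsize > 0);
	return (int64_t)blkno * secsize;
}
```

With 512-byte sectors, block 8388608 sits at byte offset 4294967296,
which no longer fits in 32 bits.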


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


# 1.262 28-Jul-2017 riastradh

Appease toxic bullshit warning from gcc.

If you have a better way to write a useful bounds check that happens
to always pass on LP64 but doesn't always on LP32, without making it
fail to compile on LP64 or making it an #ifdef conditional on LP32,
please put it in here instead.


# 1.261 28-Jul-2017 riastradh

Fix indentation. u_intN_t -> uintN_t. ntohl -> be32toh.

No functional change intended.


# 1.260 28-Jul-2017 riastradh

Put in a litany of judicious bounds checks around vnd headers.

Thought I was done with this crap after I rewrote vndcompress(1)!

From Ilja Van Sprundel.


Revision tags: perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32 bits of the 64-bit number.
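
A minimal sketch of the difference described above, assuming a 64-bit
sector count that has to fit a 32-bit field. The helper names are
hypothetical and are not taken from vnd.c:

```c
#include <stdint.h>

/* Hypothetical helper: saturate a 64-bit sector count at UINT32_MAX. */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/* The behaviour being replaced: silently keep only the low 32 bits. */
static uint32_t
truncate_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

With truncation, a disk of 2^32 + 1 sectors would be reported as having a
single sector; with clamping it is reported as the largest representable
size instead.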


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.
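
The general pattern behind this kind of fix can be sketched in userland C.
The function name below is hypothetical; the point is only that a string
which is not a compile-time constant goes through "%s" rather than being
passed as the format itself:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy a device name into buf.
 * The name is treated as data, so any '%' in it is printed literally.
 */
static int
format_name(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}
```

Writing snprintf(buf, len, name) instead would interpret conversion
specifiers embedded in the name, which is the hazard the commit avoids.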


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
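
The padding effect described above can be illustrated with two made-up
structures. These are not the real vnd ioctl structures; the field names
and layout are assumptions chosen only to show why offsetof() of an
appended 64-bit member is not a reliable stand-in for the size of the old
structure:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old" layout: three 32-bit fields, 12 bytes. */
struct ioc_old {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
};

/* Hypothetical "new" layout: same fields plus a 64-bit size at the end. */
struct ioc_new {
	uint32_t a;
	uint32_t b;
	uint32_t osize;
	uint64_t size;
};

/* Includes any padding inserted to align the 64-bit member. */
static size_t
old_size_guess(void)
{
	return offsetof(struct ioc_new, size);
}

/* Independent of how the new member is aligned. */
static size_t
old_size_exact(void)
{
	return sizeof(struct ioc_old);
}
```

Where 64-bit integers are 8-byte aligned, old_size_guess() yields 16 while
the old structure is 12 bytes, so an ioctl number derived from the guess
would not match the one old binaries use.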


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
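
The sparse-file test named above (va_bytes < va_size) amounts to comparing
allocated bytes with the logical size. A minimal sketch, using a
hypothetical helper that takes the two values directly rather than a
struct vattr:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when fewer bytes are allocated than the
 * file's logical size, i.e. the file must contain holes.
 */
static bool
definitely_sparse(uint64_t bytes_allocated, uint64_t logical_size)
{
	return bytes_allocated < logical_size;
}
```

The check is one-sided: a false result does not show that the file is
fully allocated, since allocated bytes may include file system metadata.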


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicing about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.
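
"Break two" above is a units mismatch. The sketch below is not the kernel
routine under discussion; it is a hypothetical end-of-partition clip,
assuming 512-byte sectors, that keeps every comparison in sectors and
converts back to bytes once:

```c
#include <stdint.h>

#define SECSIZE 512u	/* assumed sector size for this sketch */

/*
 * Hypothetical helper: return how many bytes of a transfer of `bcount`
 * bytes starting at sector `blkno` fit inside a partition of `nblocks`
 * sectors.
 */
static uint64_t
clip_to_partition(uint64_t blkno, uint64_t bcount, uint64_t nblocks)
{
	/* Sectors requested, rounded up. */
	uint64_t want = (bcount + SECSIZE - 1) / SECSIZE;

	if (blkno >= nblocks)
		return 0;				/* starts at or past the end */
	if (want > nblocks - blkno)
		return (nblocks - blkno) * SECSIZE;	/* truncated transfer */
	return bcount;					/* fits entirely */
}
```

Comparing `want` (sectors) against `bcount` (bytes) at any point would
reproduce the apples-to-oranges error the commit message describes.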


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fall back to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately, this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuff from buf.h to its own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store an i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
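
A minimal sketch of this kind of validation, in C. The struct and function
names are hypothetical stand-ins, not the real struct vndgeom or the checks
actually performed in vnd.c; the point is only that every field that may later
be used as a divisor is rejected when it is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a user-specified geometry (not struct vndgeom). */
struct geom_sketch {
	uint32_t secsize;	/* bytes per sector */
	uint32_t nsectors;	/* sectors per track */
	uint32_t ntracks;	/* tracks per cylinder */
	uint32_t ncylinders;	/* cylinders per unit */
};

/*
 * Return true only if no field is zero.  Any of these values may end up
 * as a divisor in later label arithmetic, so a zero must be refused
 * before the geometry is accepted.
 */
static bool
geom_is_sane(const struct geom_sketch *g)
{
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}
```

A caller would presumably return an error such as EINVAL when the check fails,
but that detail is an assumption here.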


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler mike@cs.utah.edu)


Revision tags: prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base
# 1.259 25-Mar-2017 pgoyette

branches: 1.259.4;
Don't display step-by-step detailed error messages unless DIAGNOSTIC.

Among other things, this avoids expected "error messages" when the
module is being auto-unloaded while one or more units are still in
use.


Revision tags: pgoyette-localcount-20170320 nick-nhusb-base-20170204 bouyer-socketcan-base pgoyette-localcount-20170107 nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

branches: 1.258.2;
Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that they do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32 bits of the 64-bit number.
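
A minimal sketch of the difference, in C. The helper names are hypothetical
and not taken from vnd.c; they only illustrate why saturating is preferable to
truncating when a 64-bit sector count has to fit a 32-bit field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: truncation keeps only the low 32 bits. */
static uint32_t
sectors_truncated(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}

/* Hypothetical helper: clamping saturates at UINT32_MAX. */
static uint32_t
sectors_clamped(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}
```

With truncation, a count just past 2^32 sectors wraps around to a tiny value;
with clamping, the 32-bit field at least reports the largest size it can
represent.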


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
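
The padding problem described in this commit can be sketched in C as follows.
The structs are simplified, hypothetical stand-ins rather than the real
struct vnd_ioctl layout; they only show how offsetof() of an appended 64-bit
member can differ from the size of the old structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit members, 12 bytes, no padding. */
struct old_sketch {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
};

/* Hypothetical new layout: a 64-bit member appended at the end. */
struct new_sketch {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
	uint64_t size;
};

/*
 * Deriving the old length from the new struct includes any padding the
 * compiler inserts to align the 64-bit member.
 */
static size_t
old_len_from_offsetof(void)
{
	return offsetof(struct new_sketch, size);
}

/* Using the old struct itself gives the length old binaries actually used. */
static size_t
old_len_from_sizeof(void)
{
	return sizeof(struct old_sketch);
}
```

On an ABI that aligns 64-bit integers to 4 bytes the two values agree; on one
that aligns them to 8 bytes the offsetof() result is larger, so an ioctl
number that encodes it would not match what old binaries send.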


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
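
The "definitely sparse" test mentioned above (va_bytes < va_size) has a rough user-space analogue, sketched here under assumptions. The function names are hypothetical, and the helper assumes `st_blocks` counts 512-byte units, which is the traditional BSD convention but is not guaranteed on every system. A file can also contain holes while still passing this test (for example when indirect blocks inflate the allocated byte count), so the check only catches the unambiguous case.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/stat.h>

/*
 * Returns 1 when fewer bytes are allocated than the logical size,
 * i.e. the file must contain at least one hole; 0 otherwise.
 */
int definitely_sparse(uint64_t allocated_bytes, uint64_t logical_size)
{
	return allocated_bytes < logical_size;
}

/*
 * User-space helper: 1 if the file is definitely sparse, 0 if not,
 * -1 if stat(2) fails.  Assumes st_blocks is in 512-byte units.
 */
int file_definitely_sparse(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return definitely_sparse((uint64_t)st.st_blocks * 512u,
	    (uint64_t)st.st_size);
}
```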


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that they work.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation working. When you boot your fresh
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.
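
The probe-then-fall-back idea in this entry can be sketched as follows. This is not the vnd(4) code: `bmap_fn`, `choose_io_path` and the two demo hooks are hypothetical names, and the convention that a hole is reported as physical block -1 is an assumption of the sketch. As revision 1.185 notes, a single probe of this kind did not turn out to be a reliable sparse-file detector, so the example only illustrates the control flow.

```c
#include <assert.h>
#include <stddef.h>

enum io_path { IO_BMAP_STRATEGY, IO_READ_WRITE };

/*
 * Hypothetical stand-in for a block-mapping hook: returns 0 on
 * success and stores the physical block in *pbn (-1 for a hole),
 * or returns a nonzero error code.
 */
typedef int (*bmap_fn)(long lbn, long *pbn);

/*
 * Having a hook is not enough: probe logical block 0 and use the
 * bmap/strategy path only if the translation actually succeeds.
 */
enum io_path choose_io_path(bmap_fn bmap)
{
	long pbn = -1;

	if (bmap == NULL)
		return IO_READ_WRITE;
	if (bmap(0, &pbn) != 0 || pbn < 0)
		return IO_READ_WRITE;
	return IO_BMAP_STRATEGY;
}

/* Demo hook: every logical block maps to a valid physical block. */
int demo_bmap_ok(long lbn, long *pbn)
{
	*pbn = lbn + 100;
	return 0;
}

/* Demo hook: the call succeeds but reports a hole. */
int demo_bmap_hole(long lbn, long *pbn)
{
	(void)lbn;
	*pbn = -1;
	return 0;
}
```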


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of partition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was committed. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed deeper. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate a buffer this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may also have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as constant structures in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed up in port dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix typo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* are the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store?


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the English spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)


Revision tags: nick-nhusb-base-20161204 pgoyette-localcount-20161104 nick-nhusb-base-20161004 localcount-20160914 pgoyette-localcount-20160806
# 1.258 05-Aug-2016 pgoyette

Ignore return values when backing out of a "finish" sequence. There
really shouldn't be any errors here (we're just putting something back
that previously existed), and a panic() would be rather drastic.


Revision tags: pgoyette-localcount-20160726
# 1.257 26-Jul-2016 pgoyette

When calling devsw_attach() we need to use the expected/official driver
name (as listed in the devsw_conv[] table) to get the expected device
majors. Once rump initialization is finished (ie, it has created its
required device nodes), we need to detach the [bc]devsw so the module
initialization code doesn't get EEXIST.


Revision tags: pgoyette-localcount-base nick-nhusb-base-20160907 nick-nhusb-base-20160529 nick-nhusb-base-20160422 nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.256 08-Dec-2015 christos

branches: 1.256.2;
Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.255 30-Nov-2015 mlelstv

Fall back to VOP_READ/VOP_WRITE if the simulated disk has smaller
sectors than the underlying filesystem and VOP_STRATEGY would fail.


# 1.254 12-Nov-2015 christos

Use the new DK_DEV_BSIZE_OK() macro.


# 1.253 12-Nov-2015 christos

fix incorrect memset.


# 1.252 09-Nov-2015 christos

explain why the int cast works (suggested by kre)


# 1.251 09-Nov-2015 christos

Return ENXIO if the get ioctl exceeds the number of configured devices.
XXX: pullup-7


# 1.250 09-Nov-2015 christos

disable debugging


# 1.249 09-Nov-2015 christos

Simplify ioctl handling a little.


Revision tags: nick-nhusb-base-20150921
# 1.248 20-Aug-2015 christos

include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.


# 1.247 02-Aug-2015 mlelstv

use dk_openlock when accessing openmask.


# 1.246 28-Jul-2015 prlw1

Print vndattach error


Revision tags: nick-nhusb-base-20150606
# 1.245 25-May-2015 prlw1

typo


# 1.244 25-May-2015 prlw1

whitespace police


# 1.243 26-Apr-2015 mlelstv

Use C99-style initializers for struct dkdriver.


# 1.242 06-Apr-2015 mlelstv

Make DIOCKLABEL work. Set default to keep the disklabel after close to
not change current behaviour.


Revision tags: nick-nhusb-base-20150406
# 1.241 28-Jan-2015 bouyer

Fix typo in comment


# 1.240 28-Jan-2015 bouyer

As discussed in
http://mail-index.netbsd.org/tech-kern/2015/01/24/msg018339.html
don't bump v_numoutput if we need to vn_lock() the vnode before queuing
the corresponding I/O, because this may deadlock with genfs_do_putpages()
when called with the vnode locked (as can happen with fsync(2)).
Instead bump it just before the last VOP_STRATEGY(), or before calling
nestiobuf_done().
Thanks to Taylor R Campbell for review.


# 1.239 02-Jan-2015 christos

We have three sets of DTYPE_ constants in the kernel:
altq Drop Type
disklabel Disk Type
file Descriptor Type
(not to mention constants that contain the string DTYPE).
Let's make them two, by changing the disklabel one to be DisK TYPE since the
other disklabel constants seem to do that. Not many userland programs use
these constants (and the ones that do are mostly in ifdefs). They will
be fixed shortly.


# 1.238 31-Dec-2014 christos

make more drivers use disk_ioctl, and add a dev parameter to it so that
we can merge the "easy" disklabel ioctls to it. Ultimately all this will
go to dk_ioctl once all the drivers have been converted.


# 1.237 31-Dec-2014 christos

Centralize wedge ioctls in disk_ioctl.


# 1.236 31-Dec-2014 mlelstv

disk_blocksize and disk_set_info relay the same information
to the disk subsystem.

Make disk_set_info also set blocksize shift values.
Remove every call to disk_blocksize.

Keep disk_blocksize for ABI compatibility, make it also set dg_secsize.


# 1.235 29-Dec-2014 mlelstv

Fix default label for non-standard sector size.
Avoid integer overflow in sanity check.


Revision tags: nick-nhusb-base
# 1.234 04-Nov-2014 mlelstv

branches: 1.234.2;
support DIOCMWEDGES ioctl.


# 1.233 11-Oct-2014 mlelstv

clamp total number of sectors to UINT32_MAX instead of providing the
lower 32bit of the 64bit number.
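
The difference between clamping and truncating can be sketched in a few
lines of C. This is only an illustration under stated assumptions: the
helper names clamp_sectors() and truncate_sectors() are hypothetical and
are not taken from vnd.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: report a 64-bit sector count through a 32-bit
 * field by saturating at UINT32_MAX.
 */
static uint32_t
clamp_sectors(uint64_t nsectors)
{
	return nsectors > UINT32_MAX ? UINT32_MAX : (uint32_t)nsectors;
}

/*
 * Hypothetical helper: keep only the lower 32 bits, which can make a
 * very large disk look tiny.
 */
static uint32_t
truncate_sectors(uint64_t nsectors)
{
	return (uint32_t)nsectors;
}
```

With a count one sector above 2^32, truncation yields 1 while clamping
yields UINT32_MAX, which is at least a recognisable "too large" value.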


Revision tags: netbsd-7-base tls-earlyentropy-base tls-maxphys-base
# 1.232 25-Jul-2014 dholland

branches: 1.232.2;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.


# 1.231 25-Jul-2014 dholland

Add d_discard to all struct bdevsw instances I could find.

I've set them all to nodiscard. Some of them (wd, dk, vnd, ld,
raidframe, maybe cgd) should be implemented for real.


# 1.230 22-Jul-2014 pooka

Fix MODULE() dependencies to account for VND_COMPRESSION


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 rmind-smpnet-nbase rmind-smpnet-base
# 1.229 22-Mar-2014 prlw1

branches: 1.229.2;
DIOCGDISKINFO support for vnd
Reviewed by apb@ and christos@


Revision tags: riastradh-drm2-base3
# 1.228 16-Mar-2014 dholland

Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.


# 1.227 29-Dec-2013 pgoyette

Modularize net/zlib so it can be used by the vnd module (and, eventually,
by an opencrypto module).


# 1.226 15-Sep-2013 martin

Remove unused variable


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.225 09-Jun-2013 christos

branches: 1.225.2;
Never return ENXIO in ioctl anymore. We don't have a fixed number of vnd's
configured.


# 1.224 03-Jun-2013 christos

widen the operation on the RHS as suggested in the PR.


# 1.223 03-Jun-2013 christos

PR/47879: Takahiro HAYASHI: vnd cannot handle disk image larger than 2TiB
change size_t to uint64_t where needed.


# 1.222 29-May-2013 christos

phase 1 of disk geometry cleanup:
- centralize the geometry -> plist code so that we don't have
n useless copies of it.


Revision tags: agc-symver-base yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.221 09-Jun-2012 mlelstv

branches: 1.221.2;
autodiscover wedges


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8
# 1.220 26-Mar-2012 hannken

When backed by a sparse file limit the number of pending requests.

Should fix PR #45829: "writing to vnd on sparse file blocks on pager_map"
where the pager_map gets exhausted by requests enqueued on a vnd
device and the device worker thread blocks on putpages() needing the map.

While here always sync the underlying vnode before calling biodone().

XXX: vnd should be converted to mutex/condvar.


Revision tags: jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3 jmcneill-usbmp-pre-base2 jmcneill-usbmp-base2 netbsd-6-base jmcneill-usbmp-base jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.219 14-Oct-2011 hannken

branches: 1.219.2; 1.219.6; 1.219.8;
Change the vnode locking protocol of VOP_GETATTR() to request at least
a shared lock. Make all calls outside of file systems respect it.

The calls from file systems need review.

No objections from tech-kern.


# 1.218 29-Jun-2011 hannken

Make vnd(4) work on sparse files:
- Make the strategy decision a device flag and set VNF_USE_VN_RDWR for
files known to be sparse.
- Change handle_with_rdwr() to use POSIX_FADV_NOREUSE advice to disable
read ahead and keep the size of mapped pages below 1 MByte.

No objections on tech-kern@.


# 1.217 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.216 23-May-2011 joerg

branches: 1.216.2;
Don't use the device name as format string.


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.215 08-Feb-2011 rmind

Remove clause 3 (UCB advertising clause) from the University of Utah
copyright. Confirmed by Mike Hibler, mike at cs.utah.edu - thanks!
Also, merge UCB and Utah copyright texts back into one, as they
originally were.

Extra verification by snj@.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.214 19-Nov-2010 dholland

branches: 1.214.2; 1.214.4;
Introduce struct pathbuf. This is an abstraction to hold a pathname
and the metadata required to interpret it. Callers of namei must now
create a pathbuf and pass it to NDINIT (instead of a string and a
uio_seg), then destroy the pathbuf after the namei session is
complete.

Update all namei call sites accordingly. Add a pathbuf(9) man page and
update namei(9).

The pathbuf interface also now appears in a couple of related
additional places that were passing string/uio_seg pairs that were
later fed into NDINIT. Update other call sites accordingly.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.213 19-Sep-2010 mrg

actually, put the old definitions back into vndvar.h, under _KERNEL,
as netbsd32 wants access to them.


# 1.212 19-Sep-2010 mrg

fix the vnd_osize changes on 32 bit platforms with 64 bit alignment for
64 bit integers (eg, sparc). the problem was that the new 64 bit
element on the end was used for the offsetof() (aka size) for the old
structure, but this includes the padding required, thus the ioctl number
was set wrongly.

move all the supporting code for this inside COMPAT_50, with some renaming
to suit, and kill all the external definitions related to it.


tested on i386, amd64 and sparc.
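
The padding effect described above can be reproduced with a small C
sketch. The structures below are hypothetical stand-ins, not the real
vnd ioctl structures; they only assume a layout of 32-bit members
followed by a new 64-bit member.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical old layout: three 32-bit members, 12 bytes in total. */
struct demo_old {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
};

/* Hypothetical new layout: the same members plus a 64-bit size. */
struct demo_new {
	uint32_t flags;
	uint32_t geom;
	uint32_t osize;
	uint64_t size;
};

/*
 * Bytes of padding that offsetof() on the new member would wrongly
 * count as part of the old structure.
 */
static size_t
demo_padding(void)
{
	return offsetof(struct demo_new, size) - sizeof(struct demo_old);
}
```

On an ABI that aligns 64-bit integers to 8 bytes the result is 4, so a
size derived from offsetof() no longer matches the old structure; on an
ABI that aligns them to 4 bytes the result is 0 and the mismatch does
not show up.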


# 1.211 19-Sep-2010 mrg

add support for COMPAT_50 ioctls. struct vnd_user has a dev_t component
which grew since netbsd 5.0 (hi christos!)

fix a few issues/problems:
- the COMPAT_30 code wasn't used since opt_compat_netbsd.h wasn't included
- move 'struct vnd_ouser' (for COMPAT_30) into vnd.c itself, and call it
'struct vnd_user30'
- same for VNDIOOCGET -> VNDIOCGET30

now 'vnconfig -l' works on -current with a netbsd-5 binary, using i386.


XXX: there is still a potential problem with the old VNDIOOCSET and
VNDIOOCCLR macros on some platforms like sparc. there is padding
between the old vnd_osize member and the new vnd_size member on
platforms that want 64 bit values 64 bit aligned, but are 32 bit
otherwise (like sparc.) 64 bit systems already end up with this
member 64 bit aligned, and should be fine.

this most likely results in the old ioctl numbers being wrong and
the code won't match/run ever (ENOTTY.)


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.210 24-Jun-2010 riz

Add wedge (dk(4)) support to vnd(4) devices.


# 1.209 24-Jun-2010 hannken

Clean up vnode lock operations pass 2:

VOP_UNLOCK(vp, flags) -> VOP_UNLOCK(vp): Remove the unneeded flags argument.

Welcome to 5.99.32.

Discussed on tech-kern.


Revision tags: uebayasi-xip-base1 yamt-nfs-mp-base9
# 1.208 02-Mar-2010 pooka

branches: 1.208.2;
For the nfs throttling kludge, test against v_tag == VT_NFS instead
of v_op (the latter imposes linkage).


Revision tags: uebayasi-xip-base
# 1.207 31-Jan-2010 mlelstv

branches: 1.207.2;
Properly register blocksize with disk(9) framework.


# 1.206 23-Jan-2010 bouyer

struct buf::b_iodone is not called at splbio() any more.
Make sure non-MPsafe iodone callbacks raise the SPL as appropriate.
Fix buffer corruption issue I noticed in dk(4), and probable similar
issues in vnd(4) and cgd(4).


Revision tags: matt-premerge-20091211
# 1.205 06-Dec-2009 dsl

Make vnd_size (the returned size) 64 bit, keeping old field for ioctl
compatibility. Both fields are now unsigned.
Add compatibility for the old ioctl size.
Detect and error files which are definitely sparse (va_bytes < va_size).
Part of fix for PR/41873.
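
The sparse-file test named above (va_bytes < va_size) can be sketched as
a standalone predicate. The function name definitely_sparse() is
hypothetical, and the arguments stand in for the two vattr fields
mentioned in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: a file that occupies fewer bytes on disk than
 * its logical size must contain at least one hole.  The converse does
 * not hold, so a false result does not prove the file is dense.
 */
static bool
definitely_sparse(uint64_t bytes_allocated, uint64_t file_size)
{
	return bytes_allocated < file_size;
}
```

A file with 1 MiB allocated but a logical size of about 1 GiB would be
flagged; a fully allocated file, or one whose allocation exceeds its
size because of block rounding, would not.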


Revision tags: yamt-nfs-mp-base8 yamt-nfs-mp-base7 jym-xensuspend-nbase
# 1.204 07-Aug-2009 dyoung

Re-use DK_BUSY().


Revision tags: jymxensuspend-base yamt-nfs-mp-base6
# 1.203 07-Jul-2009 dyoung

At the bottom of vndclear(), clear VNF_CLEARING: it is no longer
needed to exclude vndopen(), and it will prevent subsequent opens
if we leave it.


# 1.202 02-Jul-2009 dyoung

In vndopen(), release the lock before returning ENXIO.


Revision tags: yamt-nfs-mp-base5 yamt-nfs-mp-base4 jym-xensuspend-base
# 1.201 07-May-2009 cegger

struct cfdata * -> cfdata_t, no functional changes intended.


# 1.200 06-May-2009 ad

Unneeded LK_CANRECURSE.


Revision tags: yamt-nfs-mp-base3
# 1.199 30-Apr-2009 dyoung

Flesh out vnd_detach(). Let the system detach vnd(4) at shutdown. Stop
vnd_ioctl(VNDIOCCLR) from racing with vndopen() to call vndclear().


# 1.198 30-Apr-2009 dyoung

Fix spelling. if( -> if (. No functional change intended.


# 1.197 30-Apr-2009 dyoung

Use NULL instead of (type *)0. Delete extraneous parentheses. No
functional change intended.


Revision tags: nick-hppapmap-base4 nick-hppapmap-base3 nick-hppapmap-base
# 1.196 18-Mar-2009 cegger

bzero -> memset


# 1.195 14-Mar-2009 apb

Add FSYNC_CACHE flag to the VOP_FSYNC() call for the DIOCCACHESYNC ioctl.
PR 41015.


# 1.194 14-Mar-2009 christos

PR/41015: Alan Barrett: vnd driver does not implement DIOCCACHESYNC


Revision tags: nick-hppapmap-base2
# 1.193 05-Feb-2009 haad

branches: 1.193.2;
Add support for the MODULAR framework to the vnd driver. Enable building of
vnd.kmod by default.


Revision tags: mjf-devfs2-base
# 1.192 13-Jan-2009 yamt

g/c BUFQ_FOO() macros and use bufq_foo() directly.


# 1.191 11-Jan-2009 cegger

make this compile


# 1.190 11-Jan-2009 christos

merge christos-time_t


Revision tags: haad-dm-base2 haad-nbase2 ad-audiomp2-base christos-time_t-nbase haad-dm-base christos-time_t-base
# 1.189 19-Nov-2008 bouyer

Check that vnd is not NULL before using it, return ENXIO if it is.
Avoids a panic when vnconfig -uF is used on a busy vnd.


# 1.188 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-base matt-mips64-base2 haad-dm-base1 wrstuden-revivesa-base-4
# 1.187 24-Sep-2008 ad

branches: 1.187.2; 1.187.4;
PR kern/38872 vnconfig panics with rw lock error

Pass IO_NODELOCKED where needed.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2 simonb-wapbl-nbase simonb-wapbl-base
# 1.186 19-Jul-2008 kardel

buf_destroy() an embedded buffer before returning memory to pool
issue detected by LOCKDEBUG panicking about "allocation contains active lock"


Revision tags: wrstuden-revivesa-base-1 wrstuden-revivesa-base
# 1.185 17-Jun-2008 cegger

branches: 1.185.2;
Disable the check introduced in rev. 1.184. It works in first place, but not in second place.

The new check is not enough to detect sparse files reliably.
per discussion with bouyer


Revision tags: yamt-pf42-base4
# 1.184 14-Jun-2008 cegger

Checking if the underlying file system supports VOP_BMAP and VOP_STRATEGY does not imply that it works.
Test if VOP_BMAP actually works before using bmap/strategy.

When you create an image with

dd if=/dev/zero of=./netbsd.img bs=1m count=1 seek=1000

then the current check actually determines the "file system"
in the image supports VOP_BMAP and VOP_STRATEGY, but VOP_BMAP can't
translate any logical block numbers which results in EIO failures.

When you try to access the image in a Xen DomU you see all disk operations
failing. Therefore test if VOP_BMAP actually works and fall back to
VOP_READ/VOP_WRITE if it doesn't.

This makes a Xen DomU installation work. When you boot your freshly
installed Xen DomU with a valid disklabel and file system in the image,
VOP_BMAP actually works and is used.

This allows you to create an image with dd as above on the Dom0 and
run a DomU installation or to quickly create another virtual disk for
an existing DomU without having to create a disklabel and file system
by hand.


# 1.183 14-Jun-2008 cegger

add closing bracket in debug message


# 1.182 10-Jun-2008 cegger

device_private(device_lookup()) -> device_lookup_private()
ok cube@


Revision tags: yamt-pf42-base3 hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.181 05-May-2008 ad

branches: 1.181.2; 1.181.4;
Back out previous. It broke the build.


# 1.180 04-May-2008 ad

Move zlib out of net/ and into kern/. It would probably be better to use
the reachover Makefiles and libz, but this is already here and it works.


# 1.179 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-pf42-baseX yamt-nfs-mp-base yamt-pf42-base
# 1.178 09-Apr-2008 cegger

branches: 1.178.2; 1.178.4;
use aprint_*_dev and device_xname


Revision tags: ad-socklock-base1 yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.177 21-Mar-2008 ad

branches: 1.177.2;
Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.176 04-Mar-2008 cube

Split device_t/softc and other related cosmetic changes.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase bouyer-xeni386-base mjf-devfs-base matt-armv6-base hpcarm-cleanup-base
# 1.175 02-Jan-2008 ad

branches: 1.175.2; 1.175.6;
Merge vmlocking2 to head.


Revision tags: vmlocking2-base3
# 1.174 18-Dec-2007 riz

Add disk-info properties to vnd(4), for use by userland tools
such as gpt(8).


Revision tags: yamt-kmem-base3
# 1.173 12-Dec-2007 smb

Add power management hooks


Revision tags: cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.172 08-Dec-2007 pooka

branches: 1.172.2; 1.172.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 vmlocking-nbase reinoud-bufcleanup-base
# 1.171 26-Nov-2007 pooka

branches: 1.171.2;
Remove the "struct lwp *" argument from all VFS and VOP interfaces.
The general trend is to remove it from all kernel interfaces and
this is a start. In case the calling lwp is desired, curlwp should
be used.

quick consensus on tech-kern


Revision tags: jmcneill-base bouyer-xenamd64-base2 yamt-x86pmap-base4 bouyer-xenamd64-base yamt-x86pmap-base3 vmlocking-base
# 1.170 08-Oct-2007 ad

branches: 1.170.4;
Merge disk init changes from the vmlocking branch. These separate init /
destroy of 'struct disk' from attach / detach.


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base2 yamt-x86pmap-base matt-mips64-base
# 1.169 29-Jul-2007 ad

branches: 1.169.4; 1.169.6; 1.169.8; 1.169.10;
It's not a good idea for device drivers to modify b_flags, as they don't
need to understand the locking around that field. Instead of setting
B_ERROR, set b_error instead. b_error is 'owned' by whoever completes
the I/O request.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.168 09-Jul-2007 ad

branches: 1.168.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.167 07-Apr-2007 hannken

Remove calls to now obsolete vn_start_write() and vn_finished_write().


# 1.166 12-Mar-2007 ad

branches: 1.166.2;
Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.


# 1.165 09-Mar-2007 yamt

branches: 1.165.2;
use b_bcount where appropriate rather than keeping b_resid sync with it.
no functional changes.


# 1.164 05-Mar-2007 christos

Fix compression problem from Cliff Wright:
Break one: because b_cylinder, and b_resid are one and the same.
The work the routine was commented as being its primary function,
end of partition adjustment, is wiped out, as at the end of
the routine b_cylinder is set, splat, doing a wipe out of the adjustment
to b_resid.
Break two: When doing the adjustment, a block count is created from a
byte count, a block calculation is done, then the results are compared to
the original byte count. i.e. apples to oranges, not blocks to blocks,
but blocks to bytes.
Break three: since if all the other drivers that used this routine
would have broken as vnd compress did, I must assume they always ignored
the results of this routine. So if end of patition adjustment is
really required then all these other drivers have been broken for a
long time.


# 1.163 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


Revision tags: ad-audiomp-base
# 1.162 21-Feb-2007 thorpej

Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.


Revision tags: post-newlock2-merge newlock2-nbase newlock2-base
# 1.161 28-Jan-2007 chs

branches: 1.161.2;
don't print b_resid when it's not valid.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4 yamt-splraiseipl-base3 netbsd-4-base
# 1.160 16-Nov-2006 christos

branches: 1.160.2; 1.160.4;
__unused removal on arguments; approved by core.


# 1.159 11-Nov-2006 jmmv

Do not mess with B_VFLUSH in the read/write case. Requested by yamt@.

(I did this because the system could panic otherwise. But this seemed to
be a side-effect of another mistake that was present in the code before it
was commited. So effectively this simplification should have happened
before.)


# 1.158 10-Nov-2006 martin

size_t is not always == int


# 1.157 09-Nov-2006 jmmv

Check if the underlying file system supports VOP_BMAP and VOP_STRATEGY and,
if not, fallback to VOP_READ and VOP_WRITE. This makes vnd work with files
on, e.g. tmpfs and smbfs; all file systems should behave as before.
OK'ed by silence in tech-kern@.


Revision tags: yamt-splraiseipl-base2
# 1.156 17-Oct-2006 dogcow

now that we have -Wno-unused-parameter, back out all the tremendously ugly
code to gratuitously access said parameters.


# 1.155 14-Oct-2006 dogcow

more super exciting fun unused arguments.


# 1.154 12-Oct-2006 christos

- sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386


# 1.153 25-Sep-2006 cube

Don't accept a compressed image that has 0 for the block size...
Fixes PR#34608.


# 1.152 23-Sep-2006 elad

Use u_quad_t and not uint64_t (even though they might be the same),
pointed out by tsutsui@.


# 1.151 23-Sep-2006 elad

PR/34589: Cliff Wright: vnd(4) compress error with large files
Applied slightly different patch (u_int64_t -> uint64_t), thanks!


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9 rpaulo-netinet-merge-pcb-base
# 1.150 03-Sep-2006 bouyer

branches: 1.150.2; 1.150.4;
Back out rev 1.149.
From various discussion about vndstrategy (see
http://mail-index.netbsd.org/tech-kern/2005/03/29/0034.html
http://mail-index.netbsd.org/tech-kern/2005/03/23/0015.html)
it's not correct to tsleep() in a strategy routine, which may be called from
interrupt context.
Unfortunately this reopens PR/10731, PR/12189, PR/20296, PR/34293

As for what the correct fix is, this needs to be analysed more deeply. I suspect
throttling the caller in vnd only hides the problem; the same caller writing
to some other device could exhaust all buffers as well. If this driver doesn't
need to allocate buffers this won't cause a deadlock, but it's bad for
performance on systems with e.g. multiple drives. Also, other stacked
block device drivers may have this issue.


Revision tags: yamt-pdpolicy-base8
# 1.149 27-Aug-2006 christos

PR/34293: Michael van Elst: vnd deadlocks on I/O buffers
Also fixes: PR/10731, PR/12189, PR/20296
Sleep while there is a buffer shortage.


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base7
# 1.148 21-Jul-2006 ad

- Use the LWP cached credentials where sane.
- Minor cosmetic changes.


Revision tags: yamt-pdpolicy-base6 chap-midi-nbase gdamore-uart-base yamt-pdpolicy-base5 chap-midi-base simonb-timecounters-base
# 1.147 14-May-2006 elad

integrate kauth.


Revision tags: yamt-pdpolicy-base4 yamt-pdpolicy-base3 elad-kernelauth-base
# 1.146 30-Mar-2006 cube

Silence device creation and destruction. That means there won't be
spurious messages when doing "vnconfig -l", but it also means there won't
be a message when an actual device is created. Oh, well.

PR#33116 by Izumi Tsutsui.


# 1.145 29-Mar-2006 thorpej

Use device_cfdata().


# 1.144 28-Mar-2006 thorpej

Use device_unit().


# 1.143 21-Mar-2006 dogcow

as yamt and bouyer pointed out, there are a few other cases where l should
be checked for NULL, if one's going to do it at all, and that the proper idiom
is KASSERT, not panic.


# 1.142 18-Mar-2006 dogcow

in VNDIOCGET, make sure there's a valid lwp. coverity CID 837.


Revision tags: peter-altq-base yamt-pdpolicy-base2 yamt-pdpolicy-base
# 1.141 01-Mar-2006 yamt

branches: 1.141.2; 1.141.4; 1.141.6;
merge yamt-uio_vmspace branch.

- use vmspace rather than proc or lwp where appropriate.
the latter is more natural to specify an address space.
(and less likely to be abused for random purposes.)
- fix a swdmover race.


Revision tags: yamt-uio_vmspace-base5
# 1.140 04-Feb-2006 yamt

vndthread: fix an integer overflow. fix a panic reported by Simon Burge.


# 1.139 04-Feb-2006 yamt

vndstrategy: do bounds_check_with_mediasize.


# 1.138 04-Feb-2006 yamt

vndthread: play with b_vp and v_numoutput as it used to do,
for "strange" filesystems like nfs. PR/32671 from Simon Burge.

although i'm not really happy with this "fix", i think that the code will
be replaced with direct i/o anyway, sooner or later...


# 1.137 04-Feb-2006 yamt

vnd_cget: remove a wrong comment.


# 1.136 04-Feb-2006 yamt

vnd_destroy: don't access freed memory.


# 1.135 02-Feb-2006 cube

branches: 1.135.2;
Fix typo.


# 1.134 02-Feb-2006 cube

Move the code that destroys the device to vndclose. That way it no longer
returns ENXIO when deconfiguring a vnd.


# 1.133 01-Feb-2006 cube

Free cfdata memory. The tap LKM might be wrong in that area, too...
Pointed out by Greg Oster.


# 1.132 01-Feb-2006 cube

Free the bufq. Pointed out by yamt@.


# 1.131 01-Feb-2006 cube

Have vnd(4) devices automatically created when the user tries to
configure one. That removes the compile-time constant that limits the
number of vnds.

Thanks xtraeme@ for testing.


# 1.130 15-Jan-2006 yamt

branches: 1.130.2;
compstrategy: remove bogus handling of B_PHYS.


# 1.129 11-Jan-2006 yamt

use nestiobuf api for vnd.


# 1.128 11-Jan-2006 yamt

don't set b_rawblkno unnecessarily.
it will be set by device strategy routine.


# 1.127 08-Jan-2006 yamt

do b_blkno -> b_rawblkno translation earlier so that bufq can use it.


# 1.126 07-Jan-2006 yamt

- do disk_busy/unbusy for requested i/o, rather than ones that we request.
- on error, be conservative about b_resid.
- make vndthread static.


# 1.125 14-Dec-2005 bouyer

branches: 1.125.2;
Only VNDIOCSET needs a valid process context, so don't blindly
dereference l, test if l is NULL first.
Fix exporting a vnd device to a XENU domain.


# 1.124 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 yamt-readahead-base2 yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base ktrace-lwp-base
# 1.123 15-Oct-2005 yamt

branches: 1.123.6;
- change the way to specify a bufq strategy. (by string rather than by number)
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)


# 1.122 28-Aug-2005 christos

Fix logic error in vndiocget.


# 1.121 20-Aug-2005 yamt

use pseudo_disk_{init,attach,detach} where appropriate.


# 1.120 19-Aug-2005 christos

64 bit inode changes.


# 1.119 18-Aug-2005 nathanw

Call VOP_UNLOCK() in the case where VND_COMPRESSION isn't defined and
we're about to return EOPNOTSUPP. Prevents a "locking against myself"
panic in vn_close() in the error return path.

Addresses PR# kern/30958


# 1.118 25-Jul-2005 drochner

minor usability improvements:
-fabricate a trivial geometry also in the case of images <=1M
(XXX I didn't add a check for >0 size -- this is generally harmless
because there are enough boundary checks present, and it allows
to test some corner cases in the disklabel handling code)
-ignore the VNF_KLABEL flag -- the vnd device is of limited (and
well-defined!) lifetime anyway, and the implications of "keeplabel"
are confusing at best


# 1.117 18-Jul-2005 christos

Fix typo in previous [thanks jukka]


# 1.116 18-Jul-2005 christos

Fix whitespace issues and use ansi prototypes.


# 1.115 17-Jul-2005 hubertf

Add support for reading cloop2 compressed filesystem image,
enable by putting VND_COMPRESSION into kernel config file.
Written by Cliff Wright, polished up slightly by me.


Revision tags: kent-audio2-base
# 1.114 31-Mar-2005 bouyer

branches: 1.114.2;
Don't eventually leak vnx and bp on unconfigure, pointed out by YAMAMOTO
Takashi. Instead, let the current I/O complete before killing the thread.


# 1.113 31-Mar-2005 yamt

introduce a function to drain bufq and use it where appropriate.


# 1.112 30-Mar-2005 bouyer

Make vnd do I/O to the underlying file from thread context. This
allows the strategy routine to be called from interrupt context, fixes
PR kern/29775 by Juan RP.
Now that pool_get() is only called from thread context, change PR_NOWAIT to
PR_WAITOK. Fix PR kern/26272 by Juergen Hannken-Illjes.
OK'd by thorpej@


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base kent-audio1-beforemerge kent-audio1-base
# 1.111 28-Oct-2004 yamt

branches: 1.111.4; 1.111.10;
move buffer queue related stuffs from buf.h to their own header, bufq.h.


# 1.110 18-Sep-2004 yamt

change some members of struct buf from long to int.
ride on 2.0H.


# 1.109 10-Sep-2004 yamt

vndstrategy/vndiodone:
don't call bgetvp/brelvp here. they are not interrupt safe.
as we're doing something like direct i/o,
there's no point to call them anyway.


# 1.108 30-Aug-2004 thorpej

Use ANSI function decls, sprinkle static.


# 1.107 21-Apr-2004 itojun

kill sprintf, use snprintf


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.106 25-Jan-2004 hannken

branches: 1.106.4;
Make VOP_STRATEGY(bp) a real VOP as discussed on tech-kern.

VOP_STRATEGY(bp) is replaced by one of two new functions:

- VOP_STRATEGY(vp, bp) Call the strategy routine of vp for bp.
- DEV_STRATEGY(bp) Call the d_strategy routine of bp->b_dev for bp.

DEV_STRATEGY(bp) is used only for block-to-block device situations.


# 1.105 10-Jan-2004 yamt

store a i/o priority hint in struct buf for buffer queue discipline.


# 1.104 19-Oct-2003 scw

Be more careful about validating the user-specified geometry, otherwise
it's too easy to specify a geometry which will cause a divide by zero
elsewhere in the disklabel code.
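
A minimal sketch of this kind of validation, assuming a simple four-field
geometry: every field is checked for zero before any of them is used as a
divisor. The `struct geom` layout and both function names are hypothetical
and only illustrate the idea; they are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry record; field names are illustrative only. */
struct geom {
	uint32_t secsize;     /* bytes per sector */
	uint32_t nsectors;    /* sectors per track */
	uint32_t ntracks;     /* tracks per cylinder */
	uint32_t ncylinders;  /* cylinders per unit */
};

/* Reject any geometry that contains a zero field. */
bool
geom_is_valid(const struct geom *g)
{
	assert(g != NULL);
	return g->secsize != 0 && g->nsectors != 0 &&
	    g->ntracks != 0 && g->ncylinders != 0;
}

/* Cylinder holding a given sector; safe only for a validated geometry. */
uint64_t
geom_cylinder_of(const struct geom *g, uint64_t sector)
{
	assert(geom_is_valid(g));
	/* Widen before multiplying so the product cannot wrap to zero. */
	uint64_t secpercyl = (uint64_t)g->nsectors * g->ntracks;
	return sector / secpercyl;
}
```

Without the up-front check, a geometry with `nsectors == 0` would make
`secpercyl` zero and the division in `geom_cylinder_of()` undefined.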


# 1.103 15-Oct-2003 hannken

Add the gating of system calls that cause modifications to the underlying
file system.
The function vfs_write_suspend stops all new write operations to a file
system, allows any file system modifying system calls already in progress
to complete, then sync's the file system to disk and returns. The
function vfs_write_resume allows the suspended write operations to
complete.

From FreeBSD with slight modifications.

Approved by: Frank van der Linden <fvdl@netbsd.org>


# 1.102 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.101 29-Jun-2003 fvdl

branches: 1.101.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.100 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.99 17-May-2003 thorpej

Add DIOCKLABEL support. Fixes PR kern/21605 (Luke Mewburn).


# 1.98 10-May-2003 thorpej

Change bounds_check_with_label() to take a pointer to the disk structure,
rather than the label itself. This paves the way for some future changes.


# 1.97 02-May-2003 dsl

Change return type of readdisklabel() to const char *
I hope I've found all the correct places!


# 1.96 26-Apr-2003 dsl

In the absence of a netbsd disklabel:
- Use partition size (instead of type) to determine whether a disklabel slot
has been filled in (eg from mbr info on i386).
- Set number of partitions to 16 to stop disklabel bleating.


# 1.95 11-Apr-2003 drochner

Add a VNDIOF_FORCE flag which forces unconfiguration if the emulated
disk is still in use.
Not for everyday use, but we have to face eg USB flash drives being
unplugged at the wrong time, and this is a way to simulate this without
wearing out the connectors.


# 1.94 27-Mar-2003 yamt

read-only configuration support.


# 1.93 01-Mar-2003 enami

Don't require root privilege explicitly to issue ioctl. It should be
controlled by file's attribute.


# 1.92 25-Feb-2003 thorpej

Add a new BUF_INIT() macro which initializes b_dep and b_interlock, and
use it. This fixes a few places where either b_dep or b_interlock were
not properly initialized.


# 1.91 05-Feb-2003 pk

Make the buffer cache code MP-safe.


# 1.90 25-Jan-2003 kleink

Fix further printf format warnings for DEBUG, in the wake of daddr_t
having changed.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base gmcgarry_ctxsw_base gmcgarry_ucred_base nathanw_sa_base
# 1.89 16-Nov-2002 mrg

vnd.c
- allow vnddetach() to return EBUSY if any vnd's are currently initialised.
lkm:
- add new 'dev' directory, initially with just a 'vnd' LKM. for now, the
vnd lkm driver requests 4 devices....

XXX: vnd should be converted to a pseudo-device that creates & deletes
instances of itself (vnd0, vnd1, etc) when vnconfig -c/-u are called,
then the vnd lkm driver can not be limited to '4' by default.


# 1.88 01-Nov-2002 mrg

implement separate read/write disk statistics:
- disk_unbusy() gets a new parameter to tell the IO direction.
- struct disk_sysctl gets 4 new members for read/write bytes/transfers.
when processing hw.diskstats, add the read&write bytes/transfers for
the old combined stats to attempt to keep backwards compatibility.

unfortunately, due to multiple bugs, this will cause new kernels and old
vmstat/iostat/systat programs to fail. however, the next time this is
changed it will not fail again.

this is just the kernel portion.


Revision tags: kqueue-aftermerge
# 1.87 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.86 06-Sep-2002 gehenna

Merge the gehenna-devsw branch into the trunk.

This merge changes the device switch tables from static array to
dynamically generated by config(8).

- All device switches are defined as a constant structure in device drivers.

- The new grammar ``device-major'' is introduced to ``files''.

device-major <prefix> char <num> [block <num>] [<rules>]

- All device major numbers must be listed in the port-dependent majors.<arch>
by using this grammar.

- Added the new naming convention.
The name of the device switch must be <prefix>_[bc]devsw for auto-generation
of device switch tables.

- The backward compatibility of loading block/character device
switch by LKM framework is broken. This is necessary to convert
from block/character device major to device name in runtime and vice versa.

- The restriction to assign device major by LKM is completely removed.
We don't need to reserve LKM entries for dynamic loading of device switch.

- In compile time, device major numbers list is packed into the kernel and
the LKM framework will refer it to assign device major number dynamically.


Revision tags: gehenna-devsw-base
# 1.85 12-Aug-2002 enami

No longer need to calculate geomsize if we use fictitious geometry.


# 1.84 02-Aug-2002 oster

Do not truncate the size of the vnd, as that will cause lossage with
images with sizes that are not multiples of 1MB. Fix as proposed by me on
tech-kern, and ok'ed by Christos.
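
The loss described above can be illustrated with a small calculation. This is
a sketch under stated assumptions: 512-byte sectors and a size that is rounded
down to whole 1 MB units. The constant and both function names are
hypothetical and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* 1 MB expressed in 512-byte sectors (the sector size is an assumption). */
#define SECTORS_PER_MB 2048u

/* Size after rounding down to a whole number of 1 MB units. */
uint64_t
truncated_sectors(uint64_t nsectors)
{
	return (nsectors / SECTORS_PER_MB) * SECTORS_PER_MB;
}

/* Sectors that become unreachable if the rounded-down size is used. */
uint64_t
lost_sectors(uint64_t nsectors)
{
	uint64_t kept = truncated_sectors(nsectors);

	assert(kept <= nsectors);
	return nsectors - kept;
}
```

For a 5000-sector image, `truncated_sectors(5000)` is 4096 and
`lost_sectors(5000)` is 904, so the tail of the image would be inaccessible;
an image of exactly 4096 sectors loses nothing.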


# 1.83 26-Jul-2002 enami

Don't sprinkle cleanup code here and there (necessary cleanup was missed).


# 1.82 21-Jul-2002 hannken

Rename bufq_init() to bufq_alloc().
Add bufq_free() to remove a buffer queue.
Avoid MALLOC while holding a spinlock.

From Chuck Silvers.


# 1.81 20-Jul-2002 hannken

Convert to new device buffer queue interface.


# 1.80 21-Jun-2002 atatat

Provide a means for vnconfig to indicate which devices are in use, and
by which files (hmm...why can't I unmount that file system over
there). Currently this is just the device and inode number of the
file backing the vnd, but hopefully consing up full pathnames can be
done at some point.


Revision tags: netbsd-1-6-base
# 1.79 02-May-2002 briggs

branches: 1.79.2; 1.79.4;
Ensure that b_bufsize is set to a range covering the buffer in vndstrategy().
This addresses kern/16570 where using the raw vnd device with a file backed by
NFS was failing due to bp->b_bufsize being 0.


Revision tags: eeh-devprop-base newlock-base
# 1.78 08-Mar-2002 thorpej

Pool deals fairly well with physical memory shortage, but it doesn't
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:

* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.

From art@openbsd.org.


Revision tags: ifpoll-base
# 1.77 13-Jan-2002 tsutsui

Call malloc(9) with M_ZERO flag instead of memset() after malloc().


# 1.76 04-Jan-2002 mrg

add a vnddetach: it just free()s the vnd_softc. useful for LKM.


# 1.75 13-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-mips-cache-base
# 1.74 22-Oct-2001 mrg

use _KERNEL_OPT so this can be built as an LKM.


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2
# 1.73 30-Sep-2001 chs

in vndstrategy(), handle the underlying file being force-unmounted.


Revision tags: post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.72 07-Jul-2001 thorpej

branches: 1.72.2; 1.72.4;
bcopy -> memcpy
bzero -> memset


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.71 08-Jan-2001 fvdl

branches: 1.71.2;
Return error in the case of using ODIOCGDINFO or ODIOCGDEFLABEL when
the number of partitions is > OLDMAXPARTITIONS. This is better
than silently truncating the label (don't want to silently throw
away partitions when using an old disklabel binary on a label with
> 8 partitions). From Enami Tsugutomo.


# 1.70 07-Jan-2001 fvdl

Adapt all disk devices in MI directories to handle ODIOC* calls
for ports that have bumped MAXPARTITIONS (and thus define
__HAVE_OLD_DISKLABEL).


# 1.69 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.68 12-Sep-2000 enami

Define an auto variable `bn' as off_t instead of int since it is finally
converted to byte offset.


# 1.67 20-Aug-2000 pk

Remove duplicate `flags' from printf format string.


# 1.66 19-Aug-2000 pk

When breaking up a transfer in vndstrategy(), only inherit B_READ and
B_ASYNC from the original buffer's flags.


Revision tags: netbsd-1-5-ALPHA2 netbsd-1-5-base minoura-xpg4dl-base
# 1.65 30-Mar-2000 augustss

branches: 1.65.4;
Remove register declarations.


Revision tags: chs-ubc2-newbase
# 1.64 07-Feb-2000 thorpej

Fix a bug in disksort_*() which caused non-optimal ordering when multiple
active partitions were on a single spindle. Add a b_rawblkno member to
struct buf which contains the non-partition-relative block number to sort
by.


# 1.63 21-Jan-2000 thorpej

Update for sys/buf.h/disksort_*() changes.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.62 17-Nov-1999 fvdl

Initialize buffer dependencies list for soft updates when initializing
a new buffer.


Revision tags: comdex-fall-1999-base fvdl-softdep-base chs-ubc2-base
# 1.61 21-Apr-1999 thorpej

branches: 1.61.2; 1.61.8;
Add a couple of missing splbio()/splx() pairs that caused pool corruption.


Revision tags: netbsd-1-4-base kenh-if-detach-base
# 1.60 12-Nov-1998 thorpej

branches: 1.60.4;
Must use PR_NOWAIT when allocating component buffers.


Revision tags: chs-ubc-base
# 1.59 31-Jul-1998 thorpej

Use the pool allocator for vndxfer and vndbuf structures.


Revision tags: eeh-paddr_t-base
# 1.58 12-Mar-1998 bouyer

branches: 1.58.2;
Better fix for PR 5113, per discussion with fvdl: now that the vnode locking
interface allows recursive locks, use it instead of the local hack to avoid
recursive locking.


# 1.57 04-Mar-1998 fvdl

Fix vn_lock argument botch. From Manuel Bouyer (PR 5113).


# 1.56 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.55 19-Feb-1998 thorpej

Include the NFS option header.


# 1.54 26-Jan-1998 bad

In vndsetcred(): after establishing credentials, flush all buffers
associated with the vnode from the buffer cache. This obviates the
need to flush the buffer cache manually after ``vnconfig -u''.


# 1.53 31-Dec-1997 enami

Fix a typo in panic string.


# 1.52 31-Dec-1997 enami

When building fake disklabel, if a partition type is other than FS_UNUSED
don't override it so that port specific hack takes effect.


# 1.51 02-Dec-1997 pk

Pull over fixes from vm_swap.c:
- guard against synchronous I/O completion
- avoid race conditions
- use bgetvp/brelvp to properly maintain the vnode holdcount
and clean/dirty buffer lists.


Revision tags: netbsd-1-3-BETA netbsd-1-3-base
# 1.50 20-Oct-1997 fvdl

branches: 1.50.2;
Do hack around VOP_BMAP call to avoid recursive locks. The locking
interface doesn't currently allow specifying recursive locks.
Should fix vnd device locking panics.


Revision tags: marc-pcmcia-base
# 1.49 10-Oct-1997 mycroft

Add a missing splx(). From augustss.


# 1.48 09-Oct-1997 jtc

Fix tipo inherited from old version of TNF copyright template.


# 1.47 08-Oct-1997 thorpej

Implement DIOCGDEFLABEL.


Revision tags: thorpej-signal-base
# 1.46 26-Aug-1997 thorpej

Add missing call to vndunlock(), per PR #3811, enami tsugutomo.


# 1.45 04-Aug-1997 fair

%x -> 0x%x


Revision tags: marc-pcmcia-bp
# 1.44 26-Jun-1997 kleink

branches: 1.44.4;
Leftover from last commit: require us to be initialized when a DIOCGDINFO
ioctl(2) is issued; the uninitialized disklabel pointer might get dereferenced
otherwise.


# 1.43 26-Jun-1997 thorpej

In vndioctl(), consolidate checks for "open for writes" and "initialized",
and cover all of the cases we're supposed to cover.


# 1.42 26-Jun-1997 thorpej

Remove an outdated comment that is not true with the Mach VM system.


# 1.41 23-Jun-1997 thorpej

Add full disklabel and partition support to the vnd driver, allowing much
greater flexibility in its use. Additionally, add support for "geometry
emulation". This allows the "geometry" of the "disk" to be specified
at config time, providing near-perfect emulation of disklabel-less floppies,
CD-ROMs, etc., including non-512-byte sectors. If a geometry is not
specified at config time, a default based on 1M cylinders will be used.


# 1.40 08-Jun-1997 pk

Remove attempt to use files with holes; it's prone to deadlocks.


# 1.39 08-Jun-1997 pk

Avoid race for residual count and pending requests count.


# 1.38 26-May-1997 pk

Pass correct offset to vn_rdwr().


# 1.37 25-May-1997 pk

Add code (#ifdef'ed VND_FILLHOLES for now) to fall back on vn_rdwr()
if VOP_BMAP() does not produce a translation.

IO_SYNC is used to prevent dirty file cache buffers. On a ffs filesystem,
once a hole is filled, subsequent vnd accesses will find valid
VOP_BMAP() translations.

Concerns:
* is the assumed semantics correct for all filesystems?
* do we actually want the automagic extension on the VND
backing store..


# 1.36 25-May-1997 pk

Use an additional structure to keep information on a set of transfers
initiated by vnd_strategy(). This allows for more natural error handling
and solves two bugs:
* vnd can disk_unbusy without disk_busy (PR#2657)
* b_resid is set correctly on the external at the end
of a transfer in case of an error.


# 1.35 25-May-1997 pk

Pass correct transfer count to disk_unbusy().


# 1.34 19-May-1997 pk

Avoid negative values for `b_dirtyend' and `b_validend'.


# 1.33 19-May-1997 pk

Fill in b_dirtyoff/b_dirtyend and b_validoff/b_validend appropriately
in each auxiliary buffer; the strategy routines (esp. NFS's) like that.


# 1.32 12-Mar-1997 mycroft

Remove bogus use of splhigh(), and apparently unneeded bzero().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.31 31-Jan-1997 thorpej

NFSCLIENT -> NFS


# 1.30 13-Oct-1996 christos

backout kprintf changes


# 1.29 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.28 25-Sep-1996 christos

Avoid problems with ptrdiff_t and size_t being different on different
architectures, by adding explicit casts to printf arguments. Is there
a better way to do this?


# 1.27 10-Jul-1996 cgd

print difference between pointers with %ld, so that -Wformat works
on the Alpha and for consistency. Also, other minor formatting cleanups.


Revision tags: netbsd-1-2-PATCH001 netbsd-1-2-RELEASE netbsd-1-2-BETA netbsd-1-2-base
# 1.26 30-Mar-1996 christos

Remove dependencies to dev_conf.h and the file itself.


# 1.25 16-Mar-1996 christos

fix printf() formats


# 1.24 10-Feb-1996 christos

vnd.c: Typo (disk_deta{t,}ch) It was detach in the header file and
detatch everywhere else. Reverted to the english spelling.
Also fixed the rest of the prototype warnings while I was at it.
ic/ncr5380sbc.c: Don't declare Debugger()... I have to clean this
everywhere :-(


# 1.23 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.22 06-Nov-1995 thorpej

Bring in several changes from the ccd:
* Be a bit better with prototypes
* Use struct dkdevice in vnd_softc.
* Prevent the unit from being unconfigured while open.
* Implement a simple locking mechanism and use it for sanity's
sake.
Still needs more work; needs to support disklabels and the like.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.21 05-Oct-1995 mycroft

Lock the underlying vnode around VOP_BMAP() and VOP_READ(). From John Kohl.


# 1.20 04-Jul-1995 mycroft

Make each disk and tape driver define its own read and write functions.
Deprecate rawread() and rawwrite() completely. Remove d_strategy from cdevsw to
force the abstraction barrier.


# 1.19 26-Jun-1995 cgd

make dump stubs consistent


# 1.18 27-Feb-1995 cgd

use a buf-within-struct to avoid overloading b_pfcent.


# 1.17 25-Jan-1995 cgd

vn -> vnd renaming, for consistency


# 1.16 24-Dec-1994 cgd

various cleanups for -Wall, suggested by James Jegers.


# 1.15 14-Dec-1994 mycroft

Remove extra arg to d_open and vn_open().


# 1.14 14-Nov-1994 christos

added extra argument to vn_open


# 1.13 30-Oct-1994 cgd

be more careful with types, also pull in headers where necessary.


# 1.12 24-Aug-1994 cgd

fix (bogus) default ioctl return.


# 1.11 29-Jun-1994 cgd

branches: 1.11.2;
New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.10 27-Jun-1994 cgd

new standard, minimally intrusive ID format


# 1.9 14-Jun-1994 cgd

kill some pre-lite code that was #ifdef'd appropriately


# 1.8 24-May-1994 cgd

provide upgrade notice


# 1.7 11-May-1994 mycroft

Get rid of the private vnread() and vnwrite(); they are the same as rawread()
and rawwrite().


# 1.6 11-May-1994 mycroft

Device strategy functions return void again.


# 1.5 20-Apr-1994 cgd

update from hibler


# 1.4 27-Feb-1994 deraadt

add vnclose function


# 1.3 21-Dec-1993 cgd

fix that last


# 1.2 21-Dec-1993 brezak

Tweak for BSD44/NetBSD environ.


# 1.1 21-Dec-1993 brezak

vnode driver (from Mike Hibler make@cs.utah.edu)