History log of /freebsd-current/sys/fs/msdosfs/msdosfs_denode.c
Revision Date Author Comments
# 445d3d22 20-Feb-2024 Stefan Eßer <se@FreeBSD.org>

msdosfs: fix potential inode collision on FAT12 and FAT16

FAT file systems do not use inodes, instead all file meta-information
is stored in directory entries.

FAT12 and FAT16 use a fixed size area for root directories, with
typically 512 entries of 32 bytes each (for a total of 16 KB) on hard
disk formats. The file system data is stored in clusters of typically
512 to 4096 bytes, depending on the size of the file system.

The current code uses the offset of a DOS 8.3 style directory entry as
a pseudo-inode, which leads to inode values of 0 to 16368 for typical
root directories with 512 entries.

Sub-directories use 2 cluster length plus the byte offset of the
directory entry in the data area for the pseudo-inode, which may be
as low as 1024 in case of 512 byte clusters. A sub-directory in
cluster 2 and with 512 byte clusters will therefore lead to a
re-use of inode 1024 when there are at least 32 DOS 8.3 style
filenames in the root directory (or 11 14-character Windows
long file names, each of which takes up 3 directory entries).

FAT32 file systems are not affected by this issue and FAT12/FAT16
file systems with larger cluster sizes are unlikely to have as
many directory entries in the root directory as are required to
cause the collision.

This commit leads to inode numbers that are guaranteed to not collide
for all valid FAT12 and FAT16 file system parameters. It does also
provide a small speed-up due to more efficient use of the vnode cache.

Approved by: mckusick
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D43978


# 71625ec9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c comment pattern

Remove /^/[*/]\s*\$FreeBSD\$.*\n/


# 7e4c6b21 05-Jul-2023 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: zero partially valid extended cluster

It contains arbitrary garbage, which is not cleared by vfs_bio_clrbuf()
which only zeroes invalid portions of the pages.

Reported by: Maxim Suhanov <dfirblog@gmail.com>
Discussed with: so
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 2d8cf575 08-Mar-2023 Stefan Eßer <se@FreeBSD.org>

msdosfs: fix debug print format and parameter

Building with -DMSDOSFS_DEBUG failed due to a format mismatch and
a variable that has been renamed but not updated in the printf()
parameter list.

MFC after: 1 month


# 0152d453 11-Feb-2023 Konstantin Belousov <kib@FreeBSD.org>

msdosfs deextend: validate pages of the partial buffer

Suppose that the cluster size is larger than page size. If the buffer
at the old EOF (before extending) was partial and dirty, it cannot be
automatically neither written out nor validated by the buffer cache,
since extending buffer adds invalid pages at the end.

Correct the buffer state by calling vfs_bio_clrbuf() on it, to mark
newly added and zeroed pages as valid.

Note that UFS is immune to the problem because ffs_truncate() always
allocate the block and buffer for the last byte of the file.

PR: 269341
Reported by: asomers
In collaboration with: pho
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38549


# 67dc1e7b 11-Feb-2023 Konstantin Belousov <kib@FreeBSD.org>

msdosfs deextend(): memoize DETOV(dep)

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38549


# e59180ea 09-Feb-2023 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: correct handling of vnode pager size on file extension error

If extension fails, vnode pager recorded size might be left increased.
Only update vnode pager when extension is past the point of no rollback.

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38549


# 829f0bcb 19-Dec-2022 Mateusz Guzik <mjg@FreeBSD.org>

vfs: add the concept of vnode state transitions

To quote from a comment above vput_final:
<quote>
* XXX Some filesystems pass in an exclusively locked vnode and strongly depend
* on the lock being held all the way until VOP_INACTIVE. This in particular
* happens with UFS which adds half-constructed vnodes to the hash, where they
* can be found by other code.
</quote>

As is there is no mechanism which allows filesystems to denote that a
vnode is fully initialized, consequently problems like the above are
only found the hard way(tm).

Add rudimentary support for state transitions, which in particular allow
to assert the vnode is not legally unlocked until its fate is decided
(either construction finishes or vgone is called to abort it).

The new field lands in a 1-byte hole, thus it does not grow the struct.

Bump __FreeBSD_version to 1400077

Reviewed by: kib (previous version)
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D37759


# 0721306c 04-Sep-2022 Gordon Bergling <gbe@FreeBSD.org>

msdosfs(5): Remove a double word in a source code comment

- s/that that/that/

MFC after: 3 days


# 303d3ae7 31-Jan-2022 Konstantin Belousov <kib@FreeBSD.org>

ufs, msdosfs: do not record witness order when creating vnode

When allocating new vnode, we need to lock it exclusively before
making it externally visible. Since other threads cannot observe the
vnode yet, current lock order cannot create LoR conditions.

Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D34126


# d51b0786 31-Jan-2022 Konstantin Belousov <kib@FreeBSD.org>

msdosfs_denode.c: some style

Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D34126


# 41e85eea 25-Dec-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: on integrity error, fire a task to remount filesystem to ro

In collaboration with: pho
Reviewed by: markj, mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33721


# 595ed4d7 23-Dec-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: handle inconsistently hashed denodes

It is possible, on the corrupted msdosfs volume, to have file which
denode inode number does not match the one calculated using directory
cluster. Instead of asserting the condition as impossible, handle it
and return error, after reclaiming the aliased vnode.

In collaboration with: pho
Reviewed by: markj, mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33721


# 95d42526 01-Aug-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: fix rename

Use the same locking algorithm for msdosfs_rename() as used by ufs_rename().
Convert doscheckpath() to non-sleeping version.

Reported by: trasz
PR: 257522
In collaboration with: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31464


# ae7e8a02 01-Aug-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs deget(): add locking flags argument

LK_EXCLUSIVE must be passed always, some consumers need the ability to
specify LK_NOWAIT

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31464


# 2bfd8992 14-Feb-2021 Konstantin Belousov <kib@FreeBSD.org>

vnode: move write cluster support data to inodes.

The data is only needed by filesystems that
1. use buffer cache
2. utilize clustering write support.

Requested by: mjg
Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28679


# d2203b48 01-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

msdos: vgone unconstructed vnode before vputing it

Otherwise someone else may race to start using it. Race window
was opened by r351748 ("vfs: implement usecount implying holdcnt").

Noted by: kib


# 5fdac752 20-Sep-2019 Kyle Evans <kevans@FreeBSD.org>

msdosfs: do not deget unlinked denodes

When a file is unlinked, the denode is not reclaimed until the last
reference is dropped, but the directory entry is immediately up for reuse.
This is a problem later when createde goes to grab a denode for the newly
created entry -- we search the hash and find a dead denode, then return that
without even bumping the reference count and the data later gets truncated
when the the last reference to the unlinked file is dropped.

This manifested itself as a broken in-place strip(1) on msdosfs. elfcopy
will do a sequence incredibly roughly like this:

open("/mnt/foo", ...) => fd 3
mmap()
unlink("/mnt/foo")
open("/mnt/foo", ...) => fd 4
write(4, ...)
close(4)
close(3)

and the resulting file would be truncated, but the write succeeded, as long
as a reference to the unlinked file had not been closed.

Some archaeology indicates that this bug has likely existed since msdosfs
was converted to use vfs_hash instead of a home rolled hash implementation
in r143570. Prior to that point, the hashget implementation would do a
refcnt check while searching and explicitly only return a denode with
de_refcnt != 0. vfs_hash did not yet have the callback that it does today,
so this slipped away and did not come back when it later grew that
functionality.

The comment indicating that we want to skip these denodes has been updated
to reflect where this is actually done. My repo-diving session seems to
indicate that the refcnt check was likely never actually below the comment,
to be pedantic, but instead a detail wrapped up in the hashget
implementation since the beginning of its inclusion into FreeBSD.

This bug was the cause behind the issue addressed in r352557.

Reported by: jhibbits
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21731


# 6470c8d3 29-Aug-2019 Konstantin Belousov <kib@FreeBSD.org>

Rework v_object lifecycle for vnodes.

Current implementation of vnode_create_vobject() and
vnode_destroy_vobject() is written so that it prepared to handle the
vm object destruction for live vnode. Practically, no filesystems use
this, except for some remnants that were present in UFS till today.
One of the consequences of that model is that each filesystem must
call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result
all of them get rid of the v_object in reclaim.

Move the call to vnode_destroy_vobject() to vgonel() before
VOP_RECLAIM(). This makes v_object stable: either the object is NULL,
or it is valid vm object till the vnode reclamation. Remove code from
vnode_create_vobject() to handle races with the parallel destruction.

Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D21412


# 65417f5e 24-May-2019 Alan Somers <asomers@FreeBSD.org>

Remove "struct ucred*" argument from vtruncbuf

vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever
since that function was first added in r34611. Remove it. Also, remove some
"struct ucred" arguments from fuse and nfs functions that were only used by
vtruncbuf.

Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20377


# c973ee9e 03-Apr-2019 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: zero tail of the last block on truncation for VREG vnodes as well.

Despite the call to vtruncbuf() from detrunc(), which results in
zeroing part of the partial page after EOF, there still is a
possibility to retain the stale data which is revived on file
enlargement. If the filesystem block size is greater than the page
size, partial block might keep other after-EOF pages wired and they
get reused then. Fix it by zeroing whole part of the partial buffer
after EOF, not relying on vnode_pager_setsize().

PR: 236977
Reported by: asomers
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 195e6c50 30-Jul-2018 Ed Maste <emaste@FreeBSD.org>

msdosfs: trim EOL whitespace


# b732ceb6 06-May-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

msdosfs: use vfs_timestamp() to generate timestamps instead of getnanotime().

Most filesystems, with the notable exceptions of msdosfs and autofs use
only vfs_timestamp() to read the current time. This has the benefit of
configurable granularity (using the vfs.timestamp_precision sysctl).

For convenience, use it on msdosfs too.

Submitted by: Damjan Jovanovic
Differential Revision: https://reviews.freebsd.org/D15297


# d63027b6 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/fs: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# 027bebe8 18-Oct-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: fix build with MSDOSFS_DEBUG

Inspired by a patch submission by longwitz@incore.de with many changes
for ino64 in HEAD.

PR: 199152
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation


# 6a1c2e1f 02-Jun-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: use mem{cpy,move,set} instead of bcopy,bzero

This somewhat simplifies use of msdosfs code in userland (for makefs),
reduces diffs with NetBSD and is standard C as of C89.

Reviewed by: imp
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11014


# 9287dbaa 21-May-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: capitalize FAT appropriately

Diff reduction with NetBSD, including some nearby minor whitespace or
style fixes.

Obtained from: NetBSD
Sponsored by: The FreeBSD Foundation


# 9ed01c32 17-Apr-2017 Gleb Smirnoff <glebius@FreeBSD.org>

All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.


# 10c9700f 09-Jan-2015 Ed Maste <emaste@FreeBSD.org>

ANSIfy sys/fs/msdosfs

There are a number of msdosfs improvements in NetBSD that may be worth
bringing over, and this reduces noise in the comparison.

Differential Revision: https://reviews.freebsd.org/D1466
Reviewed by: kib
Sponsored by: The FreeBSD Foundation


# 7da1a731 21-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Expand the use of stat(2) flags to allow storing some Windows/DOS
and CIFS file attributes as BSD stat(2) flags.

This work is intended to be compatible with ZFS, the Solaris CIFS
server's interaction with ZFS, somewhat compatible with MacOS X,
and of course compatible with Windows.

The Windows attributes that are implemented were chosen based on
the attributes that ZFS already supports.

The summary of the flags is as follows:

UF_SYSTEM: Command line name: "system" or "usystem"
ZFS name: XAT_SYSTEM, ZFS_SYSTEM
Windows: FILE_ATTRIBUTE_SYSTEM

This flag means that the file is used by the
operating system. FreeBSD does not enforce any
special handling when this flag is set.

UF_SPARSE: Command line name: "sparse" or "usparse"
ZFS name: XAT_SPARSE, ZFS_SPARSE
Windows: FILE_ATTRIBUTE_SPARSE_FILE

This flag means that the file is sparse. Although
ZFS may modify this in some situations, there is
not generally any special handling for this flag.

UF_OFFLINE: Command line name: "offline" or "uoffline"
ZFS name: XAT_OFFLINE, ZFS_OFFLINE
Windows: FILE_ATTRIBUTE_OFFLINE

This flag means that the file has been moved to
offline storage. FreeBSD does not have any special
handling for this flag.

UF_REPARSE: Command line name: "reparse" or "ureparse"
ZFS name: XAT_REPARSE, ZFS_REPARSE
Windows: FILE_ATTRIBUTE_REPARSE_POINT

This flag means that the file is a Windows reparse
point. ZFS has special handling code for reparse
points, but we don't currently have the other
supporting infrastructure for them.

UF_HIDDEN: Command line name: "hidden" or "uhidden"
ZFS name: XAT_HIDDEN, ZFS_HIDDEN
Windows: FILE_ATTRIBUTE_HIDDEN

This flag means that the file may be excluded from
a directory listing if the application honors it.
FreeBSD has no special handling for this flag.

The name and bit definition for UF_HIDDEN are
identical to the definition in MacOS X.

UF_READONLY: Command line name: "urdonly", "rdonly", "readonly"
ZFS name: XAT_READONLY, ZFS_READONLY
Windows: FILE_ATTRIBUTE_READONLY

This flag means that the file may not written or
appended, but its attributes may be changed.

ZFS currently enforces this flag, but Illumos
developers have discussed disabling enforcement.

The behavior of this flag is different than MacOS X.
MacOS X uses UF_IMMUTABLE to represent the DOS
readonly permission, but that flag has a stronger
meaning than the semantics of DOS readonly permissions.

UF_ARCHIVE: Command line name: "uarch", "uarchive"
ZFS_NAME: XAT_ARCHIVE, ZFS_ARCHIVE
Windows name: FILE_ATTRIBUTE_ARCHIVE

The UF_ARCHIVED flag means that the file has changed and
needs to be archived. The meaning is same as
the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and
the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attribute.

msdosfs and ZFS have special handling for this flag.
i.e. they will set it when the file changes.

sys/param.h: Bump __FreeBSD_version to 1000047 for the
addition of new stat(2) flags.

chflags.1: Document the new command line flag names
(e.g. "system", "hidden") available to the
user.

ls.1: Reference chflags(1) for a list of file flags
and their meanings.

strtofflags.c: Implement the mapping between the new
command line flag names and new stat(2)
flags.

chflags.2: Document all of the new stat(2) flags, and
explain the intended behavior in a little
more detail. Explain how they map to
Windows file attributes.

Different filesystems behave differently
with respect to flags, so warn the
application developer to take care when
using them.

zfs_vnops.c: Add support for getting and setting the
UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN,
UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.

All of these flags are implemented using
attributes that ZFS already supports, so
the on-disk format has not changed.

ZFS currently doesn't allow setting the
UF_REPARSE flag, and we don't really have
the other infrastructure to support reparse
points.

msdosfs_denode.c,
msdosfs_vnops.c: Add support for getting and setting
UF_HIDDEN, UF_SYSTEM and UF_READONLY
in MSDOSFS.

It supported SF_ARCHIVED, but this has been
changed to be UF_ARCHIVE, which has the same
semantics as the DOS archive attribute instead
of inverse semantics like SF_ARCHIVED.

After discussion with Bruce Evans, change
several things in the msdosfs behavior:

Use UF_READONLY to indicate whether a file
is writeable instead of file permissions, but
don't actually enforce it.

Refuse to change attributes on the root
directory, because it is special in FAT
filesystems, but allow most other attribute
changes on directories.

Don't set the archive attribute on a directory
when its modification time is updated.
Windows and DOS don't set the archive attribute
in that scenario, so we are now bug-for-bug
compatible.

smbfs_node.c,
smbfs_vnops.c: Add support for UF_HIDDEN, UF_SYSTEM,
UF_READONLY and UF_ARCHIVE in SMBFS.

This is similar to changes that Apple has
made in their version of SMBFS (as of
smb-583.8, posted on opensource.apple.com),
but not quite the same.

We map SMB_FA_READONLY to UF_READONLY,
because UF_READONLY is intended to match
the semantics of the DOS readonly flag.
The MacOS X code maps both UF_IMMUTABLE
and SF_IMMUTABLE to SMB_FA_READONLY, but
the immutable flags have stronger meaning
than the DOS readonly bit.

stat.h: Add definitions for UF_SYSTEM, UF_SPARSE,
UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY
and UF_HIDDEN.

The definition of UF_HIDDEN is the same as
the MacOS X definition.

Add commented-out definitions of
UF_COMPRESSED and UF_TRACKED. They are
defined in MacOS X (as of 10.8.2), but we
do not implement them (yet).

ufs_vnops.c: Add support for getting and setting
UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY,
UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS.
Alphabetize the flags that are supported.

These new flags are only stored, UFS does
not take any action if the flag is set.

Sponsored by: Spectra Logic
Reviewed by: bde (earlier version)


# 293e4eb6 02-May-2013 Konstantin Belousov <kib@FreeBSD.org>

The fsync(2) call should sync the vnode in such way that even after
system crash which happen after successfull fsync() return, the data
is accessible. For msdosfs, this means that FAT entries for the file
must be written.

Since we do not track the FAT blocks containing entries for the
current file, just do a sloppy sync of the devvp vnode for the mount,
which buffers, among other things, contain FAT blocks.

Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a
return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED
flag clear when the waitfor argument is false. The directory buffer
for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the
current block content;
- try to cluster the write, fall back to bawrite() if low on
resources.

Based on the submission by: bde
MFC after: 2 weeks


# c6e0355c 19-Nov-2012 Attilio Rao <attilio@FreeBSD.org>

r16312 is not any longer real since many years (likely since when VFS
received granular locking) but the comment present in UFS has been
copied all over other filesystems code incorrectly for several times.

Removes comments that makes no sense now.

Reviewed by: kib
MFC after: 3 days


# af6e6b87 23-Apr-2012 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove unused thread argument to vrecycle().

Reviewed by: kib


# c52fd858 23-Apr-2012 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove unused thread argument from vtruncbuf().

Reviewed by: kib


# 2aacee77 22-Feb-2012 Konstantin Belousov <kib@FreeBSD.org>

Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests.

Noted by: bde
MFC after: 1 week


# 54cf9198 22-Nov-2011 Konstantin Belousov <kib@FreeBSD.org>

Put all the messages from msdosfs under the MSDOSFS_DEBUG ifdef.
They are confusing to user, and not informative for general consumption.

MFC after: 1 week


# 730b63b0 19-Nov-2010 Konstantin Belousov <kib@FreeBSD.org>

Remove prtactive variable and related printf()s in the vop_inactive
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.

MFC after: 1 week
X-MFC-note: keep prtactive symbol in vfs_subr.c


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 0b53cc9f 13-Oct-2010 Rui Paulo <rpaulo@FreeBSD.org>

Ignore the return value of DE_INTERNALIZE().


# 670baf9d 24-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r204469:
In msdosfs deget(), properly handle the case when the vnode is found in hash.


# ac8743bc 24-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r204468:
In msdosfs_inactive(), reclaim the vnodes both for SLOT_DELETED and
SLOT_EMPTY deName[0] values.


# 1a07617f 24-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r204466:
Assert that the msdosfs vnode is (e)locked in several places.
Change the check and return on impossible condition into KASSERT().


# 0bdbd6270 28-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

In msdosfs deget(), properly handle the case when the vnode is found in hash.

Tested by: pho
MFC after: 3 weeks


# 740a7201 28-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

In msdosfs_inactive(), reclaim the vnodes both for SLOT_DELETED and
SLOT_EMPTY deName[0] values. Besides conforming to FAT specification, it
also clears the issue where vfs_hash_insert found the vnode in hash, and
newly allocated vnode is vput()ed. There, deName[0] == 0, and vnode is
not reclaimed, indefinitely kept on mountlist.

Tested by: pho
MFC after: 3 weeks


# ef6a2be3 28-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

Assert that the msdosfs vnode is (e)locked in several places.
The plan is to use vnode lock to protect denode and fat cache,
and having separate lock for block use map.

Change the check and return on impossible condition into KASSERT().

Tested by: pho
MFC after: 3 weeks


# fe5e7466 20-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r203826:
Use M_ZERO instead of calling bzero().
Fix function name in the comment.

MFC r203828:
Fix function name in the comment in the second location too.


# 699d124f 12-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

Fix function name in the comment in the second location too.

Submitted by: ed
MFC after: 1 week


# 8b36e813 12-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

Use M_ZERO instead of calling bzero().
Fix function name in the comment.

MFC after: 1 week


# c72ae142 27-Feb-2009 John Baldwin <jhb@FreeBSD.org>

- Hold a reference on the cdev a filesystem is mounted from in the mount.
- Remove the cdev pointers from the denode and instead use the mountpoint's
reference to call dev2udev() in getattr().

Reviewed by: kib, julian


# f99f675d 11-Jan-2009 Edward Tomasz Napierala <trasz@FreeBSD.org>

Fix msdosfs_print(), which in turn fixes "show lockedvnods" for msdosfs
vnodes.

Reviewed by: kib
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation


# 1ede983c 23-Oct-2008 Dag-Erling Smørgrav <des@FreeBSD.org>

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 0e9eb108 23-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is
always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is bogus really and no currently used.
Hopefully this will be soonly replaced by something suitable for it.
- Remove the prototype for dumplockinfo() as the function is no longer
present

Addictionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only
curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h

KPI results heavilly broken by this change, so manpages and
FreeBSD_version will be modified accordingly by further commits.

Tested by: matteo


# cb65c1ee 18-Oct-2007 Bruce Evans <bde@FreeBSD.org>

Implement the async (really, delayed-write) mount option for msdosfs.

This is much simpler than for ffs since there are many fewer places
where we need to choose between a delayed write and a sync write --
just 5 in msdosfs and more than 30 in ffs.

This is more complete and correct than in ffs. Several places in ffs
are are still missing the choice. ffs_update() has a layering violation
that breaks callers which want to force a sync update (mainly fsync(2)
and O_SYNC write(2)).

However, fsync(2) and O_SYNC write(2) are still more broken than in
ffs, since they are broken for default (non-sync non-async) mounts
too. Both fail to sync the FAT in all cases, and both fail to sync
the directory entry in some cases after losing a race. Async everything
is probably safer than the half-baked sync of metadata given by default
mounts.


# e3117f85 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Fix some style bugs (don't assume that off_t == int64_t; fix some comments;
remove some parentheses; fix some whitespace errors; fix only one case of
a boolean comparison of a non-boolean).

Improve an error message by quoting ".", and by not printing large positive
values as negative ones.

Approved by: re (kensmith) (blanket)


# a878a31c 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Remove unused include(s).

Approved by: re (kensmith) (blanket)


# 6fd81fc7 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Remove unused include(s).

Approved by: re (kensmith) (blanket)


# 61b9d89f 12-Mar-2007 Tor Egge <tegge@FreeBSD.org>

Make insmntque() externally visibile and allow it to fail (e.g. during
late stages of unmount). On failure, the vnode is recycled.

Add insmntque1(), to allow for file system specific cleanup when
recycling vnode on failure.

Change getnewvnode() to no longer call insmntque(). Previously,
embryonic vnodes were put onto the list of vnode belonging to a file
system, which is unsafe for a file system marked MPSAFE.

Change vfs_hash_insert() to no longer lock the vnode. The caller now
has that responsibility.

Change most file systems to lock the vnode and call insmntque() or
insmntque1() after a new vnode has been sufficiently setup. Handle
failed insmntque*() calls by propagating errors to callers, possibly
after some file system specific cleanup.

Approved by: re (kensmith)
Reviewed by: kib
In collaboration with: kib


# 3c960d93 24-Oct-2006 Poul-Henning Kamp <phk@FreeBSD.org>

Replace slightly crummy fattime<->timespec conversion functions.


# 92e73f57 17-Jan-2006 Alfred Perlstein <alfred@FreeBSD.org>

I ran into an nfs client panic a couple of times in a row over the
last few days. I tracked it down to the fact that nfs_reclaim()
is setting vp->v_data to NULL _before_ calling vnode_destroy_object().
After silence from the mailing list I checked further and discovered
that ufs_reclaim() is unique among FreeBSD filesystems for calling
vnode_destroy_object() early, long before tossing v_data or much
of anything else, for that matter. The rest, including NFS, appear
to be identical, as if they were just clones of one original routine.

The enclosed patch fixes all file systems in essentially the same
way, by moving the call to vnode_destroy_object() to early in the
routine (before the call to vfs_hash_remove(), if any). I have
only tested NFS, but I've now run for over eighteen hours with the
patch where I wouldn't get past four or five without it.

Submitted by: Frank Mayhar
Requested by: Mohan Srinivasan
MFC After: 1 week


# 5bb84bc8 31-Oct-2005 Robert Watson <rwatson@FreeBSD.org>

Normalize a significant number of kernel malloc type names:

- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.


# 5ddf2985 07-Sep-2005 David E. O'Brien <obrien@FreeBSD.org>

Ensure the full value is written into inode variables.

PR: 85503
Submitted by: Dmitry Pryanishnikov <dmitry@atlantis.dp.ua>


# f4b423ae 07-Apr-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Give msdosfs a unique inode number which is really the byteoffset of
the directory entry.

This solves the corruption problem I belive.

Regression test script by: silby


# 7d2832e6 25-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- Pass LK_EXCLUSIVE as the lock type to vget in vfs_hash_insert().


# 51f5ce0c 16-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Add two arguments to the vfs_hash() KPI so that filesystems which do
not have unique hashes (NFS) can also use it.


# 3b97f388 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Eliminate cdev pointer in inodes, they're not used or needed.

The cdev could have been pulled out of the mountpoint cheaper back
when it was used anyway.


# 45c26fa2 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Improve the vfs_hash() API: vput() the unneeded vnode centrally to
avoid replicating the vput in all the filesystems.


# e82ef95c 15-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Simplify the vfs_hash calling convention.


# a30fc63b 13-Mar-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Use vfs_hash instead of home-rolling.


# 8da00465 12-Mar-2005 Jeff Roberson <jeff@FreeBSD.org>

- The VI_DOOMED flag now signals the end of a vnode's relationship with
the filesystem. Check that rather than VI_XLOCK.
- VOP_INACTIVE should no longer drop the vnode lock.
- The vnode lock is required around calls to vrecycle() and vgone().

Sponsored by: Isilon Systems, Inc.


# a369f34d 28-Jan-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Make filesystems get rid of their own vnodes vnode_pager object in
VOP_RECLAIM().


# d4eb29ba 28-Jan-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused argument to vrecycle()


# d167cf6f 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for copyright notices, minor format tweaks as necessary


# 4b440374 02-Dec-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the de_devvp and stop VREF'ing it for every vnode we create.


# aec0fb7b 01-Dec-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

Replace the entire vop_t frobbing thing with properly typed
structures. The only casualty is that we can not add a new
VOP_ method with a loadable module. History has not given
us reason to belive this would ever be feasible in the the
first place.

Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

Give coda correct prototypes and function definitions for
all vop_()s.

Generate a bit more data from the vnode_if.src file: a
struct vop_vector and protype typedefs for all vop methods.

Add a new vop_bypass() and make vop_default be a pointer
to another struct vop_vector.

Remove a lot of vfs_init since vop_vector is ready to use
from the compiler.

Cast various vop_mumble() to void * with uppercase name,
for instance VOP_PANIC, VOP_NULL etc.

Implement VCALL() by making vdesc_offset the offsetof() the
relevant function pointer in vop_vector. This is disgusting
but since the code is generated by a script comparatively
safe. The alternative for nullfs etc. would be much worse.

Fix up all vnode method vectors to remove casts so they
become typesafe. (The bulk of this is generated by scripts)


# 9a135592 29-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Move MSDOSFS to GEOM backing instead of DEVFS.

For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.


# 1affa3ad 07-Sep-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Create simple function init_va_filerev() for initializing a va_filerev
field.

Replace three instances of longhaired initialization va_filerev fields.

Added XXX comment wondering why we don't use random bits instead of
uptime of the system for this purpose.


# 89c9c53d 16-Jun-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 9c695a26 04-Oct-2003 Jeff Roberson <jeff@FreeBSD.org>

- Don't cache_purge() in *_reclaim routines. vclean() does it for us so
this is redundant.


# c2f95f66 27-Jun-2003 Tom Rhodes <trhodes@FreeBSD.org>

Fix a bug where a truncate operation involving truncate() or ftruncate() on
an MSDOSFS file system either failed, silently corrupted the file, or
sometimes corrupted the neighboring file.

PR: 53695
Submitted by: Ariff Abdullah <skywizard@MyBSD.org.my> (original version)
MFC: 3 days


# 67af3913 09-Apr-2003 Warner Losh <imp@FreeBSD.org>

It appears that msdosfs_init() is called multiple times. This happens
on my system where I preload msdosfs and have it in my kernel.
There's likely another bug that's causing msdosfs_init() to be called
multiple times, but this makes that harmless.


# a163d034 18-Feb-2003 Warner Losh <imp@FreeBSD.org>

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 44956c98 21-Jan-2003 Alfred Perlstein <alfred@FreeBSD.org>

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# a5b65058 13-Oct-2002 Kirk McKusick <mckusick@FreeBSD.org>

Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by: DARPA & NAI Labs.


# 4d93c0be 24-Sep-2002 Jeff Roberson <jeff@FreeBSD.org>

- Use vrefcnt() where it is safe to do so instead of doing direct and
unlocked accesses to v_usecount.
- Lock access to the buf lists in the various sync routines. interlock
locking could be avoided almost entirely in leaf filesystems if the
fsync function had a generic helper.


# 06be2aaa 14-Sep-2002 Nate Lawson <njl@FreeBSD.org>

Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by: phk
Reviewed by: bde, rwatson (earlier version)


# b5cdbc6d 15-Aug-2002 Tom Rhodes <trhodes@FreeBSD.org>

When a cluster entry for ``.'' is set to 0, msdosfs fails to handle it
correctly.

PR: 24393
Submitted by: semenu
Approved by: rwatson (mentor)
MFC after: 1 week


# e6e370a7 04-Aug-2002 Jeff Roberson <jeff@FreeBSD.org>

- Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.

Idea stolen from: BSD/OS


# 6008862b 04-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on: i386, alpha, sparc64


# 11caded3 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P.


# 23b59018 20-Dec-2001 Matthew Dillon <dillon@FreeBSD.org>

Fix a BUF_TIMELOCK race against BUF_LOCK and fix a deadlock in vget()
against VM_WAIT in the pageout code. Both fixes involve adjusting
the lockmgr's timeout capability so locks obtained with timeouts do not
interfere with locks obtained without a timeout.

Hopefully MFC: before the 4.5 release


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 1166fb51 25-May-2001 Ruslan Ermilov <ru@FreeBSD.org>

- sys/msdosfs moved to sys/fs/msdosfs
- msdos.ko renamed to msdosfs.ko
- /usr/include/msdosfs moved to /usr/include/fs/msdosfs


# 60fb0ce3 28-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# d98dc34f 23-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Correct #includes to work with fixed sys/mount.h.


# 9ed346ba 08-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 1b367556 23-Jan-2001 Jason Evans <jasone@FreeBSD.org>

Convert all simplelocks to mutexes and remove the simplelock implementations.


# 9f69a457 29-Oct-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Weaken a bogus dependency on <sys/proc.h> in <sys/buf.h> by #ifdef'ing
the offending inline function (BUF_KERNPROC) on it being #included
already.

I'm not sure BUF_KERNPROC() is even the right thing to do or in the
right place or implemented the right way (inline vs normal function).

Remove consequently unneeded #includes of <sys/proc.h>


# e7b1ac75 22-Oct-2000 Boris Popov <bp@FreeBSD.org>

Remove de_lock field from denode structure and make msdosfs PDIRUNLOCK aware.


# a18b1f1d 03-Oct-2000 Jason Evans <jasone@FreeBSD.org>

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 432a8400 28-Jun-2000 Boris Popov <bp@FreeBSD.org>

Fix memory leakage on module unload.

Spotted by: fixed INVARIANTS code


# 9626b608 05-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# 01f6cfba 27-Jan-2000 Yoshihiro Takahashi <nyan@FreeBSD.org>

Supported non-512 bytes/sector format.

PR: misc/12992
Submitted by: chi@bd.mbn.or.jp (Chiharu Shibata) and
Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
Reviewed by: Dmitrij Tejblum <tejblum@arc.hq.cti.ru>


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# bfbb9ce6 11-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
major() umajor()
minor() uminor()
makedev() umakedev()
dev2udev() udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).


# 289bdf33 02-Jan-1999 Bruce Evans <bde@FreeBSD.org>

Ifdefed conditionally used simplock variables.


# f1d19042 07-Dec-1998 Archie Cobbs <archie@FreeBSD.org>

The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.


# bad3d41d 20-Nov-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Support NT VFAT lower case flags.

PR: 8383
(Mostly) Submitted by: Carl Mascott <cmascott@world.std.com>


# 1c5bb3ea 10-Nov-1998 Peter Wemm <peter@FreeBSD.org>

add #include <sys/kernel.h> where it's needed by MALLOC_DEFINE()


# 44fdad99 29-Oct-1998 Peter Wemm <peter@FreeBSD.org>

Use vtruncbuf() rather than vinvalbuf() when shortening files.


# 0492d857 17-Aug-1998 Bruce Evans <bde@FreeBSD.org>

Removed unused includes.


# 5097a20d 17-May-1998 Bruce Evans <bde@FreeBSD.org>

Don't forget to clean up after an error reading the directory entry
in deget().


# 84de3fb7 17-May-1998 Bruce Evans <bde@FreeBSD.org>

Removed vestiges of pre-Lite2 locking.


# c21410e1 17-May-1998 Poul-Henning Kamp <phk@FreeBSD.org>

s/nanoruntime/nanouptime/g
s/microruntime/microuptime/g

Reviewed by: bde


# 00af9731 04-Apr-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Time changes mark 2:

* Figure out UTC relative to boottime. Four new functions provide
time relative to boottime.

* move "runtime" into struct proc. This helps fix the calcru()
problem in SMP.

* kill mono_time.

* add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!)

* nanosleep, select & poll takes long sleeps one day at a time

Reviewed by: bde
Tested by: ache and others


# a0502b19 26-Mar-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Add two new functions, get{micro|nano}time.

They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.


# a8e44116 19-Mar-1998 KATO Takenori <kato@FreeBSD.org>

Deleted 1024bytes/sector floppy code for PC-98 arch. The
1024bytes/sector code has not worked for long time and it should be
re-implemented.


# 952a6212 18-Feb-1998 Jordan K. Hubbard <jkh@FreeBSD.org>

Update MSDOSFS code using NetBSD's msdosfs as a guide to support
FAT32 partitions. Unfortunately, we looked around here at
Walnut Creek CDROM for any newer FAT32-supporting versions
of Win95 and we were unsuccessful; only the older stuff here.
So this is untested beyond simply making sure it compiles and
someone with access to an actual FAT32 fs will have
to let us know how well it actually works.
Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
Obtained from: NetBSD


# 303b270b 08-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Staticize.


# 0b08f5f7 05-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Back out DIAGNOSTIC changes.


# 47cfdb16 04-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Turn DIAGNOSTIC into a new-style option.


# a1c995b6 12-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Last major round (Unless Bruce thinks of somthing :-) of malloc changes.

Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.

A couple of finer points by: bde


# 55166637 11-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Distribute and statizice a lot of the malloc M_* types.

Substantial input from: bde


# e4ba6a82 02-Sep-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes.


# a5db4bf4 25-Aug-1997 John Dyson <dyson@FreeBSD.org>

Back out some incorrect changes that was worse than the original bug.


# 89721f6f 21-Aug-1997 John Dyson <dyson@FreeBSD.org>

This is a trial improvement for the vnode reference count while on the vnode
free list problem. Also, the vnode age flag is no longer used by the
vnode pager. (It is actually incorrect to use then.) Constructive
feedback welcome -- just be kind.


# af3f60d5 26-Feb-1997 Bruce Evans <bde@FreeBSD.org>

Updated msdosfs to use Lite2 vfs configuration and Lite2 locking. It
should now work as (un)well as before the Lite2 merge.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 8092a8e1 12-Feb-1997 Mike Pritchard <mpp@FreeBSD.org>

Make this compile without warnings after the Lite2 merge:

- *fs_init routines now take a "struct vfsconf * vfsp" pointer
as an argument.
- Use the correct type for cookies.
- Update function prototypes.

Submitted by: bde


# 996c772f 09-Feb-1997 John Dyson <dyson@FreeBSD.org>

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# ad63a118 14-Jun-1996 Satoshi Asami <asami@FreeBSD.org>

The Great PC98 Merge.

All new code is "#ifdef PC98"ed so this should make no difference to
PC/AT (and its clones) users.

Ok'd by: core
Submitted by: FreeBSD(98) development team


# 2f9bae59 11-Jun-1996 David Greenman <dg@FreeBSD.org>

Moved the fsnode MALLOC to before the call to getnewvnode() so that the
process won't possibly block before filling in the fsnode pointer (v_data)
which might be dereferenced during a sync since the vnode is put on the
mnt_vnodelist by getnewvnode.

Pointed out by Matt Day <mday@artisoft.com>


# bd7e5f99 18-Jan-1996 John Dyson <dyson@FreeBSD.org>

Eliminated many redundant vm_map_lookup operations for vm_mmap.
Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish
overhead for merged cache.
Efficiency improvement for vfs_cluster. It used to do alot of redundant
calls to cluster_rbuild.
Correct the ordering for vrele of .text and release of credentials.
Use the selective tlb update for 486/586/P6.
Numerous fixes to the size of objects allocated for files. Additionally,
fixes in the various pagers.
Fixes for proper positioning of vnode_pager_setsize in msdosfs and ext2fs.
Fixes in the swap pager for exhausted resources. The pageout code
will not as readily thrash.
Change the page queue flags (PG_ACTIVE, PG_INACTIVE, PG_FREE, PG_CACHE) into
page queue indices (PQ_ACTIVE, PQ_INACTIVE, PQ_FREE, PQ_CACHE),
thereby improving efficiency of several routines.
Eliminate even more unnecessary vm_page_protect operations.
Significantly speed up process forks.
Make vm_object_page_clean more efficient, thereby eliminating the pause
that happens every 30seconds.
Make sequential clustered writes B_ASYNC instead of B_DELWRI even in the
case of filesystems mounted async.
Fix a panic with busy pages when write clustering is done for non-VMIO
buffers.


# efeaf95a 06-Dec-1995 David Greenman <dg@FreeBSD.org>

Untangled the vm.h include file spaghetti.


# 58c27bcf 03-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Added prototypes.


# 94a8606f 02-Aug-1995 Doug Rabson <dfr@FreeBSD.org>

Add support for the va_filerev attribute required by NFSv3.


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# 2360b9bf 11-Apr-1995 Bruce Evans <bde@FreeBSD.org>

Submitted by: Mike Pritchard <pritc003@maroon.tc.umn.edu>

Fix PR 303: msdosfs: moving a file into another directory causes panic.

" ... the code that does the rename already has the denode
locked when msdosfs_hashins() gets called, resulting in the panic
when the routine attempts to lock the denode again.
...
The attached patch changes the msdosfs_hashins() routine to not lock the
denode. The caller is now resposible for obtaining the lock instead
of having msdosfs_hashins() do it for them."


# 0f52ba35 18-Mar-1995 David Greenman <dg@FreeBSD.org>

Removed bogus, commented out, call to vnode_pager_uncache().


# b5e8ce9f 16-Mar-1995 Bruce Evans <bde@FreeBSD.org>

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# afb7bd3e 28-Jan-1995 Andreas Schulz <ats@FreeBSD.org>

Kill the comment in a comment to shut up the compiler.


# 0d94caff 09-Jan-1995 David Greenman <dg@FreeBSD.org>

These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.

Submitted by: John Dyson and David Greenman


# a176f73d 26-Dec-1994 Bruce Evans <bde@FreeBSD.org>

Fix panic for `cp -p' by root to an msdos file system. Improve handling
of attributes so that `cp -p' to an msdos file system can succeed under
favourable circumstances (no uid or gid changes and no nonzero flags
except SF_ARCHIVED).

msdosfs_vnops.c:
The in-core inode flags were confused with the on-disk inode flags, so
chflags() clobbered the lock flag and caused a panic.

denode.h, msdosfs_denode.c, msdosfs_vnops.c:
Support the msdosfs archive attibute (ATTR_ARCHIVE) by mapping it to
the complement of the SF_ARCHIVED flag and setting the ATTR_ARCHIVE
bit when a file's modification time is set (but not when a file's
permissions are set; this is the standard wrong DOS behaviour).

denode.h, msdosfs_denode.c:
Remove the DE_UPDAT() macro. It was only used once, and the corresponding
macro in ufs has already been removed.

denode.h:
Don't change the timestamp for directories in DE_TIMES() (be consistent
with deupdat()).

msdosfs_vnops.c:
Handle chown() better: return EPERM instead of EINVAL if there are
insufficient permissions; otherwise, allow null changes.


# 63e4f22a 11-Dec-1994 Bruce Evans <bde@FreeBSD.org>

Fix numerous timestamp bugs.

DE_UPDATE was confused with DE_MODIFIED in some places (they do have
confusing names). Handle them exactly the same as IN_UPDATE and
IN_MODIFIED. This fixes chmod() and chown() clobbering the mtime
and other bugs.

DE_MODIFIED was set but not used.

Parenthesize macro args.

DE_TIMES() now takes a timeval arg instead of a timespec arg. It was
stupid to use a macro for speed and do unused conversions to prepare
for the macro.

Restore the left shifting of the DOS seconds count by 1. It got
lost among the shifts for the bitfields, so DOS seconds counts
appeared to range from 0 to 29 seconds (step 1) instead of 0 to 58
seconds (step 2).

Actually use the passed-in mtime in deupdat() as documented so that
utimes() works.

Change `extern __inline's to `static inline's so that msdosfs_fat.o
can be linked when it is compiled without -O.

Remove faking of directory mtimes to always be the current time. It's
more surprising for directory mtimes to change when you read the
directories than for them not to change when you write the directories.
This should be controlled by a mount-time option if at all.


# f0707215 10-Oct-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Cosmetics. Silence gcc -Wall


# 82478919 06-Oct-1994 David Greenman <dg@FreeBSD.org>

Use tsleep() rather than sleep so that 'ps' is more informative about
the wait.


# c3c6d51e 27-Sep-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Added declarations, fixed bugs due to missing decls. At least one of them
could panic a system. (I know, it paniced mine!).


# 27a0bc89 19-Sep-1994 Doug Rabson <dfr@FreeBSD.org>

Added msdosfs.

Obtained from: NetBSD