History log of /freebsd-current/sys/fs/msdosfs/msdosfs_vnops.c
Revision Date Author Comments
# 8b67c670 17-Feb-2024 Stefan Eßer <se@FreeBSD.org>

msdosfs: fix directory corruption after rename operation

The is a bug in MSDOSFS that can be triggered when the target of a
rename operation exists. It is caused by the lack of inodes in the
FAT file system, which are substituted by the location of the DOS 8.3
directory entry in the file system. This causes the "inode" of a file
to change when its directory entry is moved to a different location.

The rename operation wants to re-use the existing directory entry
position of an existing target file name (POS1). But the code does
instead locate the first position in the directory that provides
sufficient free directory slots (POS2) to hold the target file name
and fills it with the directory data.

The rename operation continues and at the end writes directory data to
the initially retrieved location (POS1) of the old target directory.
This leads to 2 directory entries for the target file, but with
inconsistent data in the directory and in the cached file system
state.

The location that should have been re-used (POS1) is marked as deleted
in the directory, and new directory data has been written to a
different location (POS2). But the VFS cache has the newly written
data stored under the inode number that corresponds to the initially
planned position (POS1).

If then a new file is written, it can allocate the deleted directory
entries (POS1) and when it queries the cache, it retrieves data that
is valid for the target of the prior rename operation, leading to a
corrupt directory entry (at POS1) being written (DOS file name of the
earlier rename target combined with the Windows long file name of the
newly written file).

PR: 268005
Reported by: wbe@psr.com
Approved by: kib, mckusick
Fixes: 2c9cbc2d45b94
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D43951


# 661db9b3 17-Jan-2024 Konstantin Belousov <kib@FreeBSD.org>

msdosfs_rename(): implement several XXXs about downgrading to ro

Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43482


# be0df848 17-Jan-2024 Konstantin Belousov <kib@FreeBSD.org>

msdosfs_rename(): handle errors from msdosfs_lookup_ino()

Properly working storage and correct filesystem structure indeed only
allow the EJUSTRETURN return code, but since the called function needs
to read directory blocks and (re)parse the content, the assert is not
neccessary hold.

PR: 276408
Reported by: John F. Carr
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D43482


# 71625ec9 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c comment pattern

Remove /^/[*/]\s*\$FreeBSD\$.*\n/


# 0728695c 25-Apr-2023 Stefan Eßer <se@FreeBSD.org>

fs/msdosfs: Fix potential panic and size calculations

Some combinations of FAT12 file system parameters could cause a kernel
panic due to an unmapped access if the size of the FAT was larger than
the CPU page size. The reason is that FAT12 uses 3 bytes to store
2 FAT pointers, leading to partial FAT pointers at the end of buffers
of a size that is not a multiple of 3.

With a typical page size of 4 KB, this caused the FAT entry at byte
offsets 4095 and 4096 to cross the page boundary, with only the first
page mapped. This was fixed by adjusting the mapping to always cover
both bytes of each FAT entry.

Testing revealed 2 other inconsistencies that are fixed by this commit:

1) The calculation of the size of the data area did not take into
account the fact that the first two data block numbers are reserved
and that the data area starts with block 2. This could cause a
FAT12 file system created with the maximum supported number of
blocks to be incorrectly identified as FAT16.

2) The root directory does not take up space in the data area of a
FAT12 or FAT16 file system, since it is placed into a reserved
area outside of that data area. This commits makes stat() report
the logical size of the root directory, but with 0 blocks allocated
from the data area.

PR: 270587
Reviewed by: mckusick
Differential Revision: https://reviews.freebsd.org/D39386


# 8f7859e8 14-Dec-2022 Mateusz Guzik <mjg@FreeBSD.org>

vfs: retire the now unused SAVESTART flag

Bump __FreeBSD_version to 1400075

Tested by: pho


# 56da4aa5 14-Dec-2022 Mateusz Guzik <mjg@FreeBSD.org>

vfs: stop using SAVESTART for rename

ni_startdir has never reached rename routines anyway

Reviewed by: mckusick
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D34468


# a9c439ba 18-Sep-2022 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: truncate write if it would exceed the fs max file size or RLIMIT_FSIZE

PR: 164793
Reviewed by: asomers, jah, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D36625


# 701b7385 18-Sep-2022 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: disallow truncation to set file size past RLIMIT_FSIZE

PR: 164793
Reviewed by: asomers, jah, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D36625


# cc65a412 18-Sep-2022 Konstantin Belousov <kib@FreeBSD.org>

filesystems: return error from vn_rlimit_fsize() instead of EFBIG

Reviewed by: asomers, jah, markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D36625


# 5b5b7e2c 17-Sep-2022 Mateusz Guzik <mjg@FreeBSD.org>

vfs: always retain path buffer after lookup

This removes some of the complexity needed to maintain HASBUF and
allows for removing injecting SAVENAME by filesystems.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D36542


# 65990b68 04-Jan-2022 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: clusterfree() is used only in error handling cases

Change its return type to void, because its result is ignored in both
call sites. Remove oldcnp argument as well, it is NULL always.

Suggested and reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33721


# b214fcce 13-Dec-2021 Alan Somers <asomers@FreeBSD.org>

Change VOP_READDIR's cookies argument to a **uint64_t

The cookies argument is only used by the NFS server. NFSv2 defines the
cookie as 32 bits on the wire, but NFSv3 increased it to 64 bits. Our
VOP_READDIR, however, has always defined it as u_long, which is 32 bits
on some architectures. Change it to 64 bits on all architectures. This
doesn't matter for any in-tree file systems, but it matters for some
FUSE file systems that use 64-bit directory cookies.

PR: 260375
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D33404


# 2bd6d910 19-Oct-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs_rename: remove write-only variables

Reviewed by: imp, mjg
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32577


# b4a58fbf 01-Oct-2021 Mateusz Guzik <mjg@FreeBSD.org>

vfs: remove cn_thread

It is always curthread.

Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D32453


# 197a4f29 16-Sep-2021 Konstantin Belousov <kib@FreeBSD.org>

buffer pager: allow get_blksize method to return error

Reported and reviewed by: asomers
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31998


# 6ae13c0f 07-Aug-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: add doscheckpath lock

Similar to the UFS revision 8df4bc48c89a130207

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31464


# 95d42526 01-Aug-2021 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: fix rename

Use the same locking algorithm for msdosfs_rename() as used by ufs_rename().
Convert doscheckpath() to non-sleeping version.

Reported by: trasz
PR: 257522
In collaboration with: pho
Reviewed by: mckusick
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31464


# 2bfd8992 14-Feb-2021 Konstantin Belousov <kib@FreeBSD.org>

vnode: move write cluster support data to inodes.

The data is only needed by filesystems that
1. use buffer cache
2. utilize clustering write support.

Requested by: mjg
Reviewed by: asomers (previous version), fsu (ext2 parts), mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28679


# b8073b3c 31-Jan-2021 Edward Tomasz Napierala <trasz@FreeBSD.org>

msdosfs: fix vnode leak with msdosfs_rename()

This could happen when failing due to disappearing source file.

Reviewed By: kib
Tested by: pho
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27338


# cb696212 31-Jan-2021 Edward Tomasz Napierala <trasz@FreeBSD.org>

msdosfs: fix double unlock if the source file disappears

We would unlock fvp here, only to unlock it again below,
just before "bad".

Reviewed By: kib
Tested by: pho
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27339


# 599f9044 27-Dec-2020 Mark Johnston <markj@FreeBSD.org>

msdosfs: Fix a leak of dirent padding bytes

This was missed in r340856 / commit
6d2e2df764199f0a15fd743e79599391959cc17d. Three bytes from the kernel
stack may be leaked when reading directory entries.

Reported by: Syed Faraz Abrar <faraz@elttam.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation


# 1b3cb4dc 19-Nov-2020 Konstantin Belousov <kib@FreeBSD.org>

msdosfs: Add trivial support for suspension.

Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27269


# d292b194 05-Aug-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: remove the obsolete privused argument from vaccess

This brings argument count down to 6, which is passable without the
stack on amd64.


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# b249ce48 03-Jan-2020 Mateusz Guzik <mjg@FreeBSD.org>

vfs: drop the mostly unused flags argument from VOP_UNLOCK

Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by: kib (previous version)
Differential Revision: https://reviews.freebsd.org/D21427


# 6fa079fc 15-Dec-2019 Mateusz Guzik <mjg@FreeBSD.org>

vfs: flatten vop vectors

This eliminates the following loop from all VOP calls:

while(vop != NULL && \
vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
vop = vop->vop_default;

Reviewed by: jeff
Tesetd by: pho
Differential Revision: https://reviews.freebsd.org/D22738


# 454409a3 06-Sep-2019 Conrad Meyer <cem@FreeBSD.org>

msdosfs: Remove redundant brelse() after r294954

Same automation.

No functional change.


# 416e232c 21-Dec-2018 Bruce Evans <bde@FreeBSD.org>

Fix clobbering of the fatchain cache for clustered i/o's when full
clustering is not done. The bug caused extreme slowness for large
files in some cases.

There is no way to tell VOP_BMAP() how many blocks are wanted, so for
all file systems it has to waste time in some cases by searching for
more contiguous blocks than will be accessed. For msdosfs, it also
clobbered the fatchain cache in these cases by advancing the cache to
point to the chain entry for block that won't be read. This makes
the cache useless for the next sequential i/o (or VOP_BMAP()), so the
fat chain is searched from the beginning. The cache only has 1 relevant
entry, so it is similarly useless for random i/o.

Fix this by only advancing the cache to point to the chain entry for
the first block that will be read. Clustering uses results from
VOP_BMAP(), so when more than 1 block is read by clustering, the cache
is not advanced as optimally as before, but it is at most 1 cluster
size behind and searching the chain through the blocks for this cluster
doesn't take too long.


# cc426dd3 11-Dec-2018 Mateusz Guzik <mjg@FreeBSD.org>

Remove unused argument to priv_check_cred.

Patch mostly generated with cocinnelle:

@@
expression E1,E2;
@@

- priv_check_cred(E1,E2,0)
+ priv_check_cred(E1,E2)

Sponsored by: The FreeBSD Foundation


# 6d2e2df7 23-Nov-2018 Mark Johnston <markj@FreeBSD.org>

Ensure that directory entry padding bytes are zeroed.

Directory entries must be padded to maintain alignment; in many
filesystems the padding was not initialized, resulting in stack
memory being copied out to userspace. With the ino64 work there
are also some explicit pad fields in struct dirent. Add a subroutine
to clear these bytes and use it in the in-tree filesystems. The
NFS client is omitted for now as it was fixed separately in r340787.

Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation


# 1c4ca778 14-Nov-2018 Konstantin Belousov <kib@FreeBSD.org>

Add d_off support for multiple filesystems.

The d_off field has been added to the dirent structure recently.
Currently filesystems don't support this feature. Support has been
added and tested for zfs, ufs, ext2fs, fdescfs, msdosfs and unionfs.
A stub implementation is available for cd9660, nandfs, udf and
pseudofs but hasn't been tested.

Motivation for this feature: our usecase is for a userspace nfs server
(nfs-ganesha) with zfs. At the moment we cache direntry offsets by
calling lseek once per entry, with this patch we can get the offset
directly from getdirentries(2) calls which provides a significant
speedup.

Submitted by: Jack Halford <jack@gandi.net>
Reviewed by: mckusick, pfg, rmacklem (previous versions)
Sponsored by: Gandi.net
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17917


# 195e6c50 30-Jul-2018 Ed Maste <emaste@FreeBSD.org>

msdosfs: trim EOL whitespace


# 22e56aea 30-Jul-2018 Ed Maste <emaste@FreeBSD.org>

msdosfs: use same max filesize #define as NetBSD and move to header

For use by makefs msdosfs support.

Obtained from: NetBSD denode.h 1.6
Sponsored by: The FreeBSD Foundation


# b732ceb6 06-May-2018 Pedro F. Giffuni <pfg@FreeBSD.org>

msdosfs: use vfs_timestamp() to generate timestamps instead of getnanotime().

Most filesystems, with the notable exceptions of msdosfs and autofs use
only vfs_timestamp() to read the current time. This has the benefit of
configurable granularity (using the vfs.timestamp_precision sysctl).

For convenience, use it on msdosfs too.

Submitted by: Damjan Jovanovic
Differential Revision: https://reviews.freebsd.org/D15297


# 599afe53 19-Dec-2017 John Baldwin <jhb@FreeBSD.org>

Move NAME_MAX, LINK_MAX, and CHOWN_RESTRICTED out of vop_stdpathconf().

Having all filesystems fall through to default values isn't always correct
and these values can vary for different filesystem implementations. Most
of these changes just use the existing default values with a few exceptions:
- Don't report CHOWN_RESTRICTED for ZFS since it doesn't do the exact
permissions check this claims for chown().
- Use NANDFS_NAME_LEN for NAME_MAX for nandfs.
- Don't report a LINK_MAX of 0 on smbfs. Now fail with EINVAL to
indicate hard links aren't supported.

Requested by: bde (though perhaps not this exact implementation)
Reviewed by: kib (earlier version)
MFC after: 1 month
Sponsored by: Chelsio Communications


# 853b3a8a 19-Dec-2017 John Baldwin <jhb@FreeBSD.org>

Support _PC_FILESIZEBITS in msdosfs' VOP_PATHCONF().

MFC after: 1 month
Sponsored by: Chelsio Communications


# d63027b6 27-Nov-2017 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/fs: further adoption of SPDX licensing ID tags.

Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.


# 60965b76 20-Nov-2017 Conrad Meyer <cem@FreeBSD.org>

msdosfs(5): Reflect READONLY attribute in file mode

Msdosfs allows setting READONLY by clearing the owner write bit of the file
mode. (While here, correct the misspelling of S_IWUSR as VWRITE. No
functional change.)

In msdosfs_getattr, intuitively reflect that READONLY attribute to userspace
in the file mode.

Reported by: Karl Denninger <karl AT denninger.net>
Sponsored by: Dell EMC Isilon


# 027bebe8 18-Oct-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: fix build with MSDOSFS_DEBUG

Inspired by a patch submission by longwitz@incore.de with many changes
for ino64 in HEAD.

PR: 199152
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation


# 15a88f81 11-Jul-2017 John Baldwin <jhb@FreeBSD.org>

Consistently use vop_stdpathconf() for default pathconf values.

Update filesystems not currently using vop_stdpathconf() in pathconf
VOPs to use vop_stdpathconf() for any configuration variables that do
not have filesystem-specific values. vop_stdpathconf() is used for
variables that have system-wide settings as well as providing default
values for some values based on system limits. Filesystems can still
explicitly override individual settings.

PR: 219851
Reported by: cem
Reviewed by: cem, kib, ngie
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D11541


# 40373cf5 08-Jun-2017 Konstantin Belousov <kib@FreeBSD.org>

Remove msdosfs -o large support.

Its purpose was to translate the values for msdosfs inode numbers,
which is calculated from the msdosfs structures describing the file,
into the range representable by 32bit ino_t. The translation acted
for filesystems larger than 128Gb, it reserved the range 0xf0000000
(FILENO_FIRST_DYN) to UINT32_MAX and remembered some arbitrary
translation of ino >= FILENO_FIRST_DYN into this range. It consumed
memory that could be only freed by unmount, and the translation was
not stable across remounts.

With ino_t type extended to 64 bit, there is no such issue and values
can be returned without compaction to 32bit. That is, for the native
environments, the translation layer is not necessary and adds
significant undeserved code complexity. For compat ABIs which use
32bit ino_t, the vfs.ino64_trunc_error sysctl provides some measures
to soften the failure mode when inode numbers truncation is not safe.

Discussed with: bde
Sponsored by: The FreeBSD Foundation


# 6a1c2e1f 02-Jun-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: use mem{cpy,move,set} instead of bcopy,bzero

This somewhat simplifies use of msdosfs code in userland (for makefs),
reduces diffs with NetBSD and is standard C as of C89.

Reviewed by: imp
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D11014


# 9287dbaa 21-May-2017 Ed Maste <emaste@FreeBSD.org>

msdosfs: capitalize FAT appropriately

Diff reduction with NetBSD, including some nearby minor whitespace or
style fixes.

Obtained from: NetBSD
Sponsored by: The FreeBSD Foundation


# 9ed01c32 17-Apr-2017 Gleb Smirnoff <glebius@FreeBSD.org>

All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.


# 06965e96 28-Oct-2016 Konstantin Belousov <kib@FreeBSD.org>

Use buffer pager for msdosfs.

Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 2aa39445 28-Oct-2016 Konstantin Belousov <kib@FreeBSD.org>

Enable vn_io_fault() deadlock avoidance for msdosfs.

Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# 47e61f6c 15-Aug-2016 Konstantin Belousov <kib@FreeBSD.org>

Implement VOP_FDATASYNC() for msdosfs.

Standard VOP_FSYNC() implementation just syncs data buffers, and due
to this, is the correct and efficient implementation for msdosfs or
any other filesystem which uses bufer cache trivially. Provide
globally visible wrapper vop_stdfdatasync_buf() for future consumption
by other filesystems.

Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7471


# b3a15ddd 29-Apr-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/fs: spelling fixes in comments.

No functional change.


# db7e4ae8 07-Feb-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

msdosfs_rename: yet another unused value.

As with r295355, it seems to be left over from a cleanup
in r33548. The code is not in NetBSD either.

Thanks to bde for checking out the history.


# 2799a46f 06-Feb-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

msdosfs_rename: Unused value

Assigned value to pmp, is immediatedly overwritten before it can be used.

CID: 1304892


# 10c9700f 09-Jan-2015 Ed Maste <emaste@FreeBSD.org>

ANSIfy sys/fs/msdosfs

There are a number of msdosfs improvements in NetBSD that may be worth
bringing over, and this reduces noise in the comparison.

Differential Revision: https://reviews.freebsd.org/D1466
Reviewed by: kib
Sponsored by: The FreeBSD Foundation


# 789bdfdb 21-Dec-2014 Konstantin Belousov <kib@FreeBSD.org>

Handle MAKEENTRY cnp flag in the VOP_CREATE(). Curiously, some
fs, e.g. smbfs, already did it.

Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks


# df5c9c04 11-Oct-2014 Konstantin Belousov <kib@FreeBSD.org>

Do not set IN_ACCESS flag for read-only mounts. The IN_ACCESS
survives remount in rw, also it is set for vnodes on rootfs before
noatime can be set or clock is adjusted. All conditions result in
wrong atime for accessed vnodes.

Submitted by: bde
MFC after: 1 week


# 7b81a399 17-Jun-2014 Konstantin Belousov <kib@FreeBSD.org>

In msdosfs_setattr(), add a check for result of the utimes(2)
permissions test, forgotten in r164033.

Refactor the permission checks for utimes(2) into vnode helper
function vn_utimes_perm(9), and simplify its code comparing with the
UFS origin, by writing the call to VOP_ACCESSX only once. Use the
helper for UFS(5), tmpfs(5), devfs(5) and msdosfs(5).

Reported by: bde
Reviewed by: bde, trasz
Sponsored by: The FreeBSD Foundation
MFC after: 1 week


# 08cf5ceb 03-May-2014 Konstantin Belousov <kib@FreeBSD.org>

After r254627, the deupdate() started writing the directory entries to
disk. That has a side effect of corrupting the "." entries names on
rename, since the call to createde() in the msdosfs_rename() sets the
de_Name to the target name. If any change to the directory attributes
is performed, the wrong name is written back to the on-disk direntry
on update.

Overwrite the de_Name for the directories on rename to correct the dot
name.

Submitted by: bde
MFC after: 1 week


# 7da1a731 21-Aug-2013 Kenneth D. Merry <ken@FreeBSD.org>

Expand the use of stat(2) flags to allow storing some Windows/DOS
and CIFS file attributes as BSD stat(2) flags.

This work is intended to be compatible with ZFS, the Solaris CIFS
server's interaction with ZFS, somewhat compatible with MacOS X,
and of course compatible with Windows.

The Windows attributes that are implemented were chosen based on
the attributes that ZFS already supports.

The summary of the flags is as follows:

UF_SYSTEM: Command line name: "system" or "usystem"
ZFS name: XAT_SYSTEM, ZFS_SYSTEM
Windows: FILE_ATTRIBUTE_SYSTEM

This flag means that the file is used by the
operating system. FreeBSD does not enforce any
special handling when this flag is set.

UF_SPARSE: Command line name: "sparse" or "usparse"
ZFS name: XAT_SPARSE, ZFS_SPARSE
Windows: FILE_ATTRIBUTE_SPARSE_FILE

This flag means that the file is sparse. Although
ZFS may modify this in some situations, there is
not generally any special handling for this flag.

UF_OFFLINE: Command line name: "offline" or "uoffline"
ZFS name: XAT_OFFLINE, ZFS_OFFLINE
Windows: FILE_ATTRIBUTE_OFFLINE

This flag means that the file has been moved to
offline storage. FreeBSD does not have any special
handling for this flag.

UF_REPARSE: Command line name: "reparse" or "ureparse"
ZFS name: XAT_REPARSE, ZFS_REPARSE
Windows: FILE_ATTRIBUTE_REPARSE_POINT

This flag means that the file is a Windows reparse
point. ZFS has special handling code for reparse
points, but we don't currently have the other
supporting infrastructure for them.

UF_HIDDEN: Command line name: "hidden" or "uhidden"
ZFS name: XAT_HIDDEN, ZFS_HIDDEN
Windows: FILE_ATTRIBUTE_HIDDEN

This flag means that the file may be excluded from
a directory listing if the application honors it.
FreeBSD has no special handling for this flag.

The name and bit definition for UF_HIDDEN are
identical to the definition in MacOS X.

UF_READONLY: Command line name: "urdonly", "rdonly", "readonly"
ZFS name: XAT_READONLY, ZFS_READONLY
Windows: FILE_ATTRIBUTE_READONLY

This flag means that the file may not written or
appended, but its attributes may be changed.

ZFS currently enforces this flag, but Illumos
developers have discussed disabling enforcement.

The behavior of this flag is different than MacOS X.
MacOS X uses UF_IMMUTABLE to represent the DOS
readonly permission, but that flag has a stronger
meaning than the semantics of DOS readonly permissions.

UF_ARCHIVE: Command line name: "uarch", "uarchive"
ZFS_NAME: XAT_ARCHIVE, ZFS_ARCHIVE
Windows name: FILE_ATTRIBUTE_ARCHIVE

The UF_ARCHIVED flag means that the file has changed and
needs to be archived. The meaning is same as
the Windows FILE_ATTRIBUTE_ARCHIVE attribute, and
the ZFS XAT_ARCHIVE and ZFS_ARCHIVE attribute.

msdosfs and ZFS have special handling for this flag.
i.e. they will set it when the file changes.

sys/param.h: Bump __FreeBSD_version to 1000047 for the
addition of new stat(2) flags.

chflags.1: Document the new command line flag names
(e.g. "system", "hidden") available to the
user.

ls.1: Reference chflags(1) for a list of file flags
and their meanings.

strtofflags.c: Implement the mapping between the new
command line flag names and new stat(2)
flags.

chflags.2: Document all of the new stat(2) flags, and
explain the intended behavior in a little
more detail. Explain how they map to
Windows file attributes.

Different filesystems behave differently
with respect to flags, so warn the
application developer to take care when
using them.

zfs_vnops.c: Add support for getting and setting the
UF_ARCHIVE, UF_READONLY, UF_SYSTEM, UF_HIDDEN,
UF_REPARSE, UF_OFFLINE, and UF_SPARSE flags.

All of these flags are implemented using
attributes that ZFS already supports, so
the on-disk format has not changed.

ZFS currently doesn't allow setting the
UF_REPARSE flag, and we don't really have
the other infrastructure to support reparse
points.

msdosfs_denode.c,
msdosfs_vnops.c: Add support for getting and setting
UF_HIDDEN, UF_SYSTEM and UF_READONLY
in MSDOSFS.

It supported SF_ARCHIVED, but this has been
changed to be UF_ARCHIVE, which has the same
semantics as the DOS archive attribute instead
of inverse semantics like SF_ARCHIVED.

After discussion with Bruce Evans, change
several things in the msdosfs behavior:

Use UF_READONLY to indicate whether a file
is writeable instead of file permissions, but
don't actually enforce it.

Refuse to change attributes on the root
directory, because it is special in FAT
filesystems, but allow most other attribute
changes on directories.

Don't set the archive attribute on a directory
when its modification time is updated.
Windows and DOS don't set the archive attribute
in that scenario, so we are now bug-for-bug
compatible.

smbfs_node.c,
smbfs_vnops.c: Add support for UF_HIDDEN, UF_SYSTEM,
UF_READONLY and UF_ARCHIVE in SMBFS.

This is similar to changes that Apple has
made in their version of SMBFS (as of
smb-583.8, posted on opensource.apple.com),
but not quite the same.

We map SMB_FA_READONLY to UF_READONLY,
because UF_READONLY is intended to match
the semantics of the DOS readonly flag.
The MacOS X code maps both UF_IMMUTABLE
and SF_IMMUTABLE to SMB_FA_READONLY, but
the immutable flags have stronger meaning
than the DOS readonly bit.

stat.h: Add definitions for UF_SYSTEM, UF_SPARSE,
UF_OFFLINE, UF_REPARSE, UF_ARCHIVE, UF_READONLY
and UF_HIDDEN.

The definition of UF_HIDDEN is the same as
the MacOS X definition.

Add commented-out definitions of
UF_COMPRESSED and UF_TRACKED. They are
defined in MacOS X (as of 10.8.2), but we
do not implement them (yet).

ufs_vnops.c: Add support for getting and setting
UF_ARCHIVE, UF_HIDDEN, UF_OFFLINE, UF_READONLY,
UF_REPARSE, UF_SPARSE, and UF_SYSTEM in UFS.
Alphabetize the flags that are supported.

These new flags are only stored, UFS does
not take any action if the flag is set.

Sponsored by: Spectra Logic
Reviewed by: bde (earlier version)


# 293e4eb6 02-May-2013 Konstantin Belousov <kib@FreeBSD.org>

The fsync(2) call should sync the vnode in such way that even after
system crash which happen after successfull fsync() return, the data
is accessible. For msdosfs, this means that FAT entries for the file
must be written.

Since we do not track the FAT blocks containing entries for the
current file, just do a sloppy sync of the devvp vnode for the mount,
which buffers, among other things, contain FAT blocks.

Simultaneously, for deupdat():
- optimize by clearing the modified flags before short-circuiting a
return, if the mount is read-only;
- only ignore the rest of the function for denode with DE_MODIFIED
flag clear when the waitfor argument is false. The directory buffer
for the entry might be of delayed write;
- microoptimize by comparing the updated directory entry with the
current block content;
- try to cluster the write, fall back to bawrite() if low on
resources.

Based on the submission by: bde
MFC after: 2 weeks


# c535690b 14-Mar-2013 Konstantin Belousov <kib@FreeBSD.org>

Add currently unused flag argument to the cluster_read(),
cluster_write() and cluster_wbuild() functions. The flags to be
allowed are a subset of the GB_* flags for getblk().

Sponsored by: The FreeBSD Foundation
Tested by: pho


# 9ec062dd 01-Feb-2013 Konstantin Belousov <kib@FreeBSD.org>

The directory entry for dotdot was corrupted in the FAT32 case when moving
a directory to a subdir of the root directory from somewhere else.

For all directory moves that change the parent directory, the dotdot
entry must be fixed up. For msdosfs, the root directory is magic for
non-FAT32. It is less magic for FAT32, but needs the same magic for
the dotdot fixup. It didn't have it.

Both chkdsk and fsck_msdosfs fix the corrupt directory entries with no
problems.

The fix is to use the same magic for dotdot in msdosfs_rename() as in
msdosfs_mkdir().

For msdosfs_mkdir(), document the magic. When writing the dotdot entry
in mkdir, use explicitly set pcl variable instead on relying on the
start cluster of the root directory typically has a value < 65536.

Submitted by: bde
MFC after: 1 week


# c52fd858 23-Apr-2012 Edward Tomasz Napierala <trasz@FreeBSD.org>

Remove unused thread argument from vtruncbuf().

Reviewed by: kib


# 2aacee77 22-Feb-2012 Konstantin Belousov <kib@FreeBSD.org>

Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests.

Noted by: bde
MFC after: 1 week


# 526d0bd5 20-Feb-2012 Konstantin Belousov <kib@FreeBSD.org>

Fix found places where uio_resid is truncated to int.

Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with: bde, das (previous versions)
MFC after: 1 month


# 6bccea7c 21-Feb-2011 Rebecca Cran <brucec@FreeBSD.org>

Fix typos - remove duplicate "the".

PR: bin/154928
Submitted by: Eitan Adler <lists at eitanadler.com>
MFC after: 3 days


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# b0d53911 08-Oct-2010 Konstantin Belousov <kib@FreeBSD.org>

Add a comment describing the reason for calling cache_purge(fvp).

Requested by: danfe
MFC after: 6 days


# 4d477d5c 07-Oct-2010 Konstantin Belousov <kib@FreeBSD.org>

The msdosfs lookup is case insensitive. Several aliases may be inserted for
a single directory entry. As a consequnce, name cache purge done by lookup
for fvp when DELETE op for namei is specified, might be not enough to
expunge all namecache entries that were installed for this direntry.

Explicitely call cache_purge(fvp) when msdosfs_rename() succeeded.

PR: kern/93634
MFC after: 1 week


# 307d88b7 06-May-2010 Edward Tomasz Napierala <trasz@FreeBSD.org>

Style fixes and removal of unneeded variable.

Submitted by: bde@


# b5f770bd 05-May-2010 Edward Tomasz Napierala <trasz@FreeBSD.org>

Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().

Reviewed by: kib


# b2c1c014 24-Mar-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r204467:
Remove seemingly unneeded unlock/relock of the dvp in msdosfs_rmdir,
causing LOR.


# 2e45cc5b 28-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

Remove seemingly unneeded unlock/relock of the dvp in msdosfs_rmdir,
causing LOR.

Reported and tested by: pho
MFC after: 3 weeks


# e623216f 27-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

MFC r203827:
- Add idempotency guards so the structures can be used in other utilities.
- Update bpb structs with reserved fields.
- In direntry struct join deName with deExtension. Although a
fix was attempted in the past, these fields were being overflowed,
Now this is consistent with the spec, and we can now share the
WinChksum code with NetBSD.


# 8fa03d08 20-Feb-2010 Ulrich Spörlein <uqs@FreeBSD.org>

Fix common misspelling of hierarchy

Pointed out by: bf1783 at gmail
Approved by: np (cxgb), kientzle (tar, etc.), philip (mentor)


# 48d1bcf8 12-Feb-2010 Konstantin Belousov <kib@FreeBSD.org>

- Add idempotency guards so the structures can be used in other utilities.
- Update bpb structs with reserved fields.
- In direntry struct join deName with deExtension. Although a
fix was attempted in the past, these fields were being overflowed,
Now this is consistent with the spec, and we can now share the
WinChksum code with NetBSD.

Submitted by: Pedro F. Giffuni <giffunip tutopia com>
Mostly obtained from: NetBSD
Reviewed by: bde
MFC after: 2 weeks


# d6da6408 10-Jun-2009 Konstantin Belousov <kib@FreeBSD.org>

Fix r193923 by noting that type of a_fp is struct file *, not int.
It was assumed that r193923 was trivial change that cannot be done
wrong.

MFC after: 2 weeks


# e4d9bdc1 10-Jun-2009 Konstantin Belousov <kib@FreeBSD.org>

s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args
definition.

Discussed with: bde
MFC after: 2 weeks


# c72ae142 27-Feb-2009 John Baldwin <jhb@FreeBSD.org>

- Hold a reference on the cdev a filesystem is mounted from in the mount.
- Remove the cdev pointers from the denode and instead use the mountpoint's
reference to call dev2udev() in getattr().

Reviewed by: kib, julian


# 0da50f6e 16-Dec-2008 Edward Tomasz Napierala <trasz@FreeBSD.org>

According to phk@, VOP_STRATEGY should never, _ever_, return
anything other than 0. Make it so. This fixes
"panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648",
encountered when writing to an orphaned filesystem. Reason
for the panic was the following assert:
KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp));
at vfs_bio:bufstrategy().

Reviewed by: scottl, phk
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation


# 15bc6b2b 28-Oct-2008 Edward Tomasz Napierala <trasz@FreeBSD.org>

Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by: rwatson (mentor)


# 1ede983c 23-Oct-2008 Dag-Erling Smørgrav <des@FreeBSD.org>

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 4c5a20e3 20-Sep-2008 Konstantin Belousov <kib@FreeBSD.org>

Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR().
NODEV is more appropriate when va_rdev doesn't have a meaningful value.

Submitted by: Jaakko Heinonen <jh saunalahti fi>
Suggested by: bde
Discussed on: freebsd-fs
MFC after: 1 month


# 0359a12e 28-Aug-2008 Attilio Rao <attilio@FreeBSD.org>

Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 813d71de 04-Jul-2008 Konstantin Belousov <kib@FreeBSD.org>

The uniqdosname() function takes char[12] as it third argument.

Found by: -fstack-protector
Reported by: dougb
Tested by: dougb, Rainer Hurling <rhurlin gwdg de>
MFC after: 3 days


# eab626f1 16-Apr-2008 Konstantin Belousov <kib@FreeBSD.org>

Move the head of byte-level advisory lock list from the
filesystem-specific vnode data to the struct vnode. Provide the
default implementation for the vop_advlock and vop_advlockasync.
Purge the locks on the vnode reclaim by using the lf_purgelocks().
The default implementation is augmented for the nfs and smbfs.
In the nfs_advlock, push the Giant inside the nfs_dolock.

Before the change, the vop_advlock and vop_advlockasync have taken the
unlocked vnode and dereferenced the fs-private inode data, racing with
with the vnode reclamation due to forced unmount. Now, the vop_getattr
under the shared vnode lock is used to obtain the inode size, and
later, in the lf_advlockasync, after locking the vnode interlock, the
VI_DOOMED flag is checked to prevent an operation on the doomed vnode.

The implementation of the lf_purgelocks() is submitted by dfr.

Reported by: kris
Tested by: kris, pho
Discussed with: jeff, dfr
MFC after: 2 weeks


# dfdcada3 26-Mar-2008 Doug Rabson <dfr@FreeBSD.org>

Add the new kernel-mode NFS Lock Manager. To use it instead of the
user-mode lock manager, build a kernel with the NFSLOCKD option and
add '-k' to 'rpc_lockd_flags' in rc.conf.

Highlights include:

* Thread-safe kernel RPC client - many threads can use the same RPC
client handle safely with replies being de-multiplexed at the socket
upcall (typically driven directly by the NIC interrupt) and handed
off to whichever thread matches the reply. For UDP sockets, many RPC
clients can share the same socket. This allows the use of a single
privileged UDP port number to talk to an arbitrary number of remote
hosts.

* Single-threaded kernel RPC server. Adding support for multi-threaded
server would be relatively straightforward and would follow
approximately the Solaris KPI. A single thread should be sufficient
for the NLM since it should rarely block in normal operation.

* Kernel mode NLM server supporting cancel requests and granted
callbacks. I've tested the NLM server reasonably extensively - it
passes both my own tests and the NFS Connectathon locking tests
running on Solaris, Mac OS X and Ubuntu Linux.

* Userland NLM client supported. While the NLM server doesn't have
support for the local NFS client's locking needs, it does have to
field async replies and granted callbacks from remote NLMs that the
local client has contacted. We relay these replies to the userland
rpc.lockd over a local domain RPC socket.

* Robust deadlock detection for the local lock manager. In particular
it will detect deadlocks caused by a lock request that covers more
than one blocking request. As required by the NLM protocol, all
deadlock detection happens synchronously - a user is guaranteed that
if a lock request isn't rejected immediately, the lock will
eventually be granted. The old system allowed for a 'deferred
deadlock' condition where a blocked lock request could wake up and
find that some other deadlock-causing lock owner had beaten them to
the lock.

* Since both local and remote locks are managed by the same kernel
locking code, local and remote processes can safely use file locks
for mutual exclusion. Local processes have no fairness advantage
compared to remote processes when contending to lock a region that
has just been unlocked - the local lock manager enforces a strict
first-come first-served model for both local and remote lockers.

Sponsored by: Isilon Systems
PR: 95247 107555 115524 116679
MFC after: 2 weeks


# 22db15c0 13-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


# cb05b60a 09-Jan-2008 Attilio Rao <attilio@FreeBSD.org>

vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>


# cb65c1ee 18-Oct-2007 Bruce Evans <bde@FreeBSD.org>

Implement the async (really, delayed-write) mount option for msdosfs.

This is much simpler than for ffs since there are many fewer places
where we need to choose between a delayed write and a sync write --
just 5 in msdosfs and more than 30 in ffs.

This is more complete and correct than in ffs. Several places in ffs
are are still missing the choice. ffs_update() has a layering violation
that breaks callers which want to force a sync update (mainly fsync(2)
and O_SYNC write(2)).

However, fsync(2) and O_SYNC write(2) are still more broken than in
ffs, since they are broken for default (non-sync non-async) mounts
too. Both fail to sync the FAT in all cases, and both fail to sync
the directory entry in some cases after losing a race. Async everything
is probably safer than the half-baked sync of metadata given by default
mounts.


# cefb5582 18-Oct-2007 Bruce Evans <bde@FreeBSD.org>

In msdosfs_settattr(), don't do synchronous updates of the denode
(except indirectly for the size pseudo-attribute). If anything deserves
a sync update, then it is ids and immutable flags, since these are
related to security, but ffs never synced these and msdosfs doesn't
support them. (ufs_setattr() only does an update in one case where
it is least needed (for timestamps); it did pessimal sync updates for
timestamps until 1998/03/08 but was changed for unlogged reasons related
to soft updates.)

Now msdosfs calls deupdat() with waitfor == 0, which normally gives a
delayed update to disk but always gives a sync update of timestamps
in core, while for ffs everything is delayed until the syncer daemon
or other activity causes an update (except for timestamps).

This gives a large optimization mainly for things like cp -p, where
attribute adjustment could easily triple the number of physical I/O's
if it is done synchronously (but cp -p to msdosfs is not as bad as
that, since msdosfs doesn't support many attributes so null adjustments
are more common, and msdosfs doesn't support ctimes so even if cp
doesn't weed out null adjustments they don't become non-null after
clobbering the ctime).


# c2819440 31-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions
can easily block in bread(), and then there was nothing to prevent the
static buffer (nambuf_{ptr,len,last_id}) being clobbered by another
thread.

The effects of the bug seem to have been limited to failed lookups and
mangled names in readdir(), since Giant locking provides enough
serialization to prevent concurrent calls to the functions that access
the buffer. They were very obvious for multiple concurrent tree walks,
especially with a small cluster size.

The bug was introduced in msdosfs_conv.c 1.34 and associated changes,
and is in all releases starting with 5.2.

The fix is to allocate the buffer as a local variable and pass around
pointers to it like "_r" functions in libc do. Stack use from this
is large but not too large. This also fixes a memory leak on module
unload.

Reviewed by: kib
Approved by: re (kensmith)


# a4e6807c 07-Aug-2007 Bruce Evans <bde@FreeBSD.org>

In msdosfs_read() and msdosfs_write(), don't check explicitly for
(uio_offset < 0) since this can't happen. If this happens, then the
general code handles the problem safely (better than before for reading,
returning 0 (EOF) instead of the bogus errno EINVAL, and the same as
before for writing, returning EFBIG).

In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write()
already didn't check.

In msdosfs_read(), document in a comment our assumptions that the caller
passed a valid uio_offset and uio_resid. ffs checks using KASSERT(),
and that is enough sanity checking. In the same comment, partly document
there is no need to check for the EOVERFLOW case, unlike in ffs where this
case can happen at least in theory.

In msdosfs_write(), add a comment about why the checking of
(uio_resid == 0) is explicit, unlike in ffs.

In msdosfs_write(), check for impossibly large final offsets before
checking if the file size rlimit would be exceeded, so that we don't
have an overflow bug in the rlimit check and are consistent with ffs.
We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final
offset would be impossibly large but not so large as to cause overflow.
Overflow normally gave the benign behaviour of no signal.

Approved by: re (kensmith) (blanket)


# b7837a91 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Fix and update the comments about the effect of the read-only flag on writing.
They are still too verbose.

Remove nearby unreachable code for handling symlinks.

Approved by: re (kensmith) (blanket)


# c0f5121c 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Fix some style bugs (don't assume that off_t == int64_t; fix some comments;
remove some parentheses; fix only a couple of whtespace errors).

Approved by: re (kensmith) (blanket)


# d2bb66ba 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Sort includes.

Remove rotted banal comment attached to includes.

Approved by: re (kensmith) (blanket)


# eba34270 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of
depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h>

Approved by: re (kensmith) (blanket)


# 6fd81fc7 06-Aug-2007 Bruce Evans <bde@FreeBSD.org>

Remove unused include(s).

Approved by: re (kensmith) (blanket)


# 6b6c5f5e 20-Jul-2007 Bruce Evans <bde@FreeBSD.org>

Implement vfs clustering for msdosfs.

This gives a very large speedup for small block sizes (in my tests,
about 5 times for write and 3 times for read with a block size of 512,
if clustering is possible) and a moderate speedup for the moderatatly
large block sizes that should be used on non-small media (4K is the
best size in most cases, and the speedup for that is about 1.3 times
for write and 1.2 times for read). mmap() should benefit from clustering
like read()/write(), but the current implementation of vm only supports
clustering (at least for getpages) if the fs block size is >= PAGE SIZE.

msdosfs is now only slightly slower than ffs with soft updates for
writing and slightly faster for reading when both use their best block
sizes. Writing is slower for msdosfs because of more sync writes.
Reading is faster for msdosfs because indirect blocks interfere with
clustering in ffs.

The changes in msdosfs_read() and msdosfs_write() are simpler merges
of corresponding code in ffs (after fixing some style bugs in ffs).
msdosfs_bmap() needs fs-specific code. This implementation loops
calling a lower level bmap function to do the hard parts. This is a
bit inefficient, but is efficient enough since msdsfs_bmap() is only
called when there is physical i/o to do.

Approved by: re (hrs)


# d34b0a1b 20-Jul-2007 Bruce Evans <bde@FreeBSD.org>

Clean up before implementing vfs clustering for msdosfs:

In msdosfs_read(), mainly reorder the main loop to the same order as in
ffs_read().

In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of
clrbuf(). I think this just just a bogus optimization, but ffs always
does it and msdosfs already did it in one place, and it is what I've
tested.

In msdosfs_write(), merge good bits from a comment in ffs_write(), and
fix 1 style bug.

In the main comment for msdosfs_pcbmap(), improve wording and catch
up with 13 years of changes in the function. This comment belongs in
VOP_BMAP.9 but that doesn't exist.

In msdosfs_bmap(), return EFBIG if the requested cluster number is out
of bounds instead of blindly truncating it, and fix many style bugs.

Approved by: re (hrs)


# 32f9753c 11-Jun-2007 Robert Watson <rwatson@FreeBSD.org>

Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths. Do, however, move those prototypes to priv.h.

Reviewed by: csjp
Obtained from: TrustedBSD Project


# 10bcafe9 15-Feb-2007 Pawel Jakub Dawidek <pjd@FreeBSD.org>

Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by: mckusick
Discussed with: many (on IRC)
Tested with: ufs, msdosfs, cd9660, nullfs and zfs


# 61ad2e26 30-Jan-2007 Tai-hwa Liang <avatar@FreeBSD.org>

Fixing compilation bustage by removing references to opt_msdosfs.h.

This auto-generated header file no longer exists since the removal of
MSDOSFS_LARGE in sys/conf/options:1.574.


# f458f2a5 29-Jan-2007 Craig Rodrigues <rodrigc@FreeBSD.org>

Add a "-o large" mount option for msdosfs. Convert compile-time checks for
#ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified.

Test case provided by Oliver Fromme:
truncate -s 200G test.img
mdconfig -a -t vnode -f test.img -u 9
newfs_msdos -s 419430400 -n 1 /dev/md9 zip250
mount -t msdosfs /dev/md9 /mnt # should fail
mount -t msdosfs -o large /dev/md9 /mnt # should succeed

PR: 105964
Requested by: Oliver Fromme <olli lurza secnetix de>
Tested by: trhodes
MFC after: 2 weeks


# 1c5cf521 03-Dec-2006 Maxim Konovalov <maxim@FreeBSD.org>

o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME
set birthtime to FAT CTime (creation time) and in the other cases
set birthtime to -1.

o Set ctime to mtime instead of FAT CTime which has completely
different meaning.

PR: kern/106018
Submitted by: Oliver Fromme
MFC after: 1 month


# acd3428b 06-Nov-2006 Robert Watson <rwatson@FreeBSD.org>

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 3c960d93 24-Oct-2006 Poul-Henning Kamp <phk@FreeBSD.org>

Replace slightly crummy fattime<->timespec conversion functions.


# 89b0e109 31-Jan-2006 Jeff Roberson <jeff@FreeBSD.org>

- Reorder calls to vrele() after calls to vput() when the vrele is a
directory. vrele() may lock the passed vnode, which in these cases would
give an invalid lock order of child -> parent. These situations are
deadlock prone although do not typically deadlock because the vrele
is typically not releasing the last reference to the vnode. Users of
vrele must consider it as a call to vn_lock() and order it appropriately.

MFC After: 1 week
Sponsored by: Isilon Systems, Inc.
Tested by: kkenn


# 9fc31f8a 23-Jan-2006 Tom Rhodes <trhodes@FreeBSD.org>

Update incorrect comments here, there should not be a call to panic()
over fs corruption.

Discussed with: alfred, phk


# 710a9acc 22-Jan-2006 Max Khon <fjoe@FreeBSD.org>

Do not assume that `char direntry::deExtension[3]' starts right after
`char direntry::deName[8]' and access deExtension[] explicitly.

Found by: Coverity Prevent(tm)
CID: 350, 351, 352


# 7ce296cf 27-Feb-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Remove debug printout of major/minor numbers, print name instead.


# 72b3e305 29-Jan-2005 Peter Edwards <peadar@FreeBSD.org>

Unbreak a few filesystems for which vnode_create_vobject() wasn't being
called in "open", causing mmap() to fail.

Where possible, pass size of file to vnode_create_vobject() rather
than having it find it out the hard way via VOP_LOOKUP

Reviewed by: phk


# 83c64397 13-Jan-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Whitespace in vop_vector{} initializations.


# 0391e5a1 11-Jan-2005 Poul-Henning Kamp <phk@FreeBSD.org>

Wrap the bufobj operations in macros: BO_STRATEGY() and BO_WRITE()


# d167cf6f 06-Jan-2005 Warner Losh <imp@FreeBSD.org>

/* -> /*- for copyright notices, minor format tweaks as necessary


# 4b440374 02-Dec-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Remove the de_devvp and stop VREF'ing it for every vnode we create.


# aec0fb7b 01-Dec-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

Replace the entire vop_t frobbing thing with properly typed
structures. The only casualty is that we can not add a new
VOP_ method with a loadable module. History has not given
us reason to belive this would ever be feasible in the the
first place.

Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

Give coda correct prototypes and function definitions for
all vop_()s.

Generate a bit more data from the vnode_if.src file: a
struct vop_vector and protype typedefs for all vop methods.

Add a new vop_bypass() and make vop_default be a pointer
to another struct vop_vector.

Remove a lot of vfs_init since vop_vector is ready to use
from the compiler.

Cast various vop_mumble() to void * with uppercase name,
for instance VOP_PANIC, VOP_NULL etc.

Implement VCALL() by making vdesc_offset the offsetof() the
relevant function pointer in vop_vector. This is disgusting
but since the code is generated by a script comparatively
safe. The alternative for nullfs etc. would be much worse.

Fix up all vnode method vectors to remove casts so they
become typesafe. (The bulk of this is generated by scripts)


# 6fde64c7 30-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Mechanically change prototypes for vnode operations to use the new typedefs.


# 9c83534d 15-Nov-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Make VOP_BMAP return a struct bufobj for the underlying storage device
instead of a vnode for it.

The vnode_pager does not and should not have any interest in what
the filesystem uses for backend.

(vfs_cluster doesn't use the backing store argument.)


# 9a135592 29-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Move MSDOSFS to GEOM backing instead of DEVFS.

For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.


# d83b7498 27-Oct-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Eliminate unnecessary KASSERTs.

Don't use bp->b_vp in VOP_STRATEGY: the vnode is passed in as an argument.


# 56f21b9d 26-Jul-2004 Colin Percival <cperciva@FreeBSD.org>

Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is
somewhat clearer, but more importantly allows for a consistent naming
scheme for suser_cred flags.

The old name is still defined, but will be removed in a few days (unless I
hear any complaints...)

Discussed with: rwatson, scottl
Requested by: jhb


# 3bc482ec 03-Jul-2004 Tim J. Robbins <tjr@FreeBSD.org>

By popular request, add a workaround that allows large (>128GB or so)
FAT32 filesystems to be mounted, subject to some fairly serious limitations.

This works by extending the internal pseudo-inode-numbers generated from
the file's starting cluster number to 64-bits, then creating a table
mapping these into arbitrary 32-bit inode numbers, which can fit in
struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings
do not persist across unmounts or reboots, so it's not possible to export
these filesystems through NFS. The mapping table may grow to be rather
large, and may grow large enough to exhaust kernel memory on filesystems
with millions of files.

Don't enable this option unless you understand the consequences.


# 91d5354a 04-Feb-2004 John Baldwin <jhb@FreeBSD.org>

Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count. The plimit
structure is treated similarly to struct ucred in that is is always copy
on write, so having a reference to a structure is sufficient to read from
it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
limits from a process to keep the limit structure from changing out from
under you while reading from it.
- Various global limits that are ints are not protected by a lock since
int writes are atomic on all the archs we support and thus a lock
wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
either an rlimit, or the current or max individual limit of the specified
resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
(it didn't used the stackgap when it should have) but uses lim_rlimit()
and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits. It
also no longer uses the stackgap for accessing sysctl's for the
ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result,
ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by: mtm (mostly, I only did a few cleanups and catchups)
Tested on: i386
Compiled on: alpha, amd64


# be039c56 29-Dec-2003 Bruce Evans <bde@FreeBSD.org>

Fixed some minor style bugs in rev.1.144. All related to msdosfs_advlock()
(mainly unsorting). There were no changes related to the dirty flag
here. The reference NetBSD implementation put msdosfs_advlock() in a
different place. This commit only moves its declarations and changes
some of the function body to be like the NetBSD version.


# cede1f56 26-Dec-2003 Tom Rhodes <trhodes@FreeBSD.org>

Make msdosfs support the dirty flag in FAT16 and FAT32.
Enable lockf support.

PR: 55861
Submitted by: Jun Su <junsu@m-net.arbornet.org> (original version)
Reviewed by: make universe


# 2c18019f 18-Oct-2003 Poul-Henning Kamp <phk@FreeBSD.org>

DuH!

bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in
the file)


# c87b01a0 18-Oct-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Initialize b_offset before calling VOP_STRATEGY/VOP_SPECSTRATEGY.

Remove various comments of KASSERTS and comments about B_PHYS which
does not apply anymore.


# c4f02a89 26-Sep-2003 Max Khon <fjoe@FreeBSD.org>

- Support for multibyte charsets in LIBICONV.
- CD9660_ICONV, NTFS_ICONV and MSDOSFS_ICONV kernel options
(with corresponding modules).
- kiconv(3) for loadable charset conversion tables support.

Submitted by: Ryuichiro Imura <imura@ryu16.org>


# c98a31ca 12-Aug-2003 Tom Rhodes <trhodes@FreeBSD.org>

Add a '-M mask' option so that users can have different
masks for files and directories. This should make some
of the Midnight Commander users happy.

Remove an extra ')' in the manual page.

PR: 35699
Submitted by: Eugene Grosbein <eugen@grosbein.pp.ru> (original version)
Tested by: simon


# 3c01bab8 03-Jul-2003 Tom Rhodes <trhodes@FreeBSD.org>

If bread() returns a zero-length buffer, as can happen after a
failed write, return an error instead of looping forever.

PR: 37035
Submitted by: das


# cefb5754 15-Jun-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations
to check that the buffer points to the correct vnode.


# 67096659 31-May-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unused variable(s).

Found by: FlexeLint


# 7261f5f6 03-Mar-2003 Jeff Roberson <jeff@FreeBSD.org>

- Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
flag to the initial BUF_LOCK(). This will eventually be used in cases
were we want to use a buffer only if it is not currently in use.
- Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by: arch
Not objected to by: mckusick


# 99648386 03-Mar-2003 Nate Lawson <njl@FreeBSD.org>

Finish cleanup of vprint() which was begun with changing v_tag to a string.
Remove extraneous uses of vop_null, instead defering to the default op.
Rename vnode type "vfs" to the more descriptive "syncer".
Fix formatting for various filesystems that use vop_print.


# 8994a245 02-Mar-2003 Dag-Erling Smørgrav <des@FreeBSD.org>

Clean up whitespace, s/register //, refrain from strong urge to ANSIfy.


# c9524588 02-Mar-2003 Dag-Erling Smørgrav <des@FreeBSD.org>

uiomove-related caddr_t -> void * (just the low-hanging fruit)


# a163d034 18-Feb-2003 Warner Losh <imp@FreeBSD.org>

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 767b9a52 09-Feb-2003 Jeff Roberson <jeff@FreeBSD.org>

- Cleanup unlocked accesses to buf flags by introducing a new b_vflag member
that is protected by the vnode lock.
- Move B_SCANNED into b_vflags and call it BV_SCANNED.
- Create a vop_stdfsync() modeled after spec's sync.
- Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some
fs specific processing. This gives all of these filesystems proper
behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag.
- Annotate the locking in buf.h


# 44956c98 21-Jan-2003 Alfred Perlstein <alfred@FreeBSD.org>

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# f5b11b6e 04-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Temporarily introduce a new VOP_SPECSTRATEGY operation while I try
to sort out disk-io from file-io in the vm/buffer/filesystem space.

The intent is to sort VOP_STRATEGY calls into those which operate
on "real" vnodes and those which operate on VCHR vnodes. For
the latter kind, the call will be changed to VOP_SPECSTRATEGY,
possibly conditionally for those places where dual-use happens.

Add a default VOP_SPECSTRATEGY method which will call the normal
VOP_STRATEGY. First time it is called it will print debugging
information. This will only happen if a normal vnode is passed
to VOP_SPECSTRATEGY by mistake.

Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY
does on a VCHR vnode today.

Add a new VOP_STRATEGY method in specfs to catch instances where
the conversion to VOP_SPECSTRATEGY has not yet happened. Handle
the request just like we always did, but first time called print
debugging information.

Apart up to two instances of console messages per boot, this amounts
to a glorified no-op commit.

If you get any of the messages on your console I would very much
like a copy of them mailed to phk@freebsd.org


# c6e3ae99 04-Jan-2003 Poul-Henning Kamp <phk@FreeBSD.org>

Since Jeffr made the std* functions the default in rev 1.63 of
kern/vfs_defaults.c it is wrong for the individual filesystems to use
the std* functions as that prevents override of the default.

Found by: src/tools/tools/vop_table


# 3a3d82ec 28-Dec-2002 Matthew Dillon <dillon@FreeBSD.org>

Abstract-out the constants for the sequential heuristic.

No operational changes.

MFC after: 1 day


# 4d93c0be 24-Sep-2002 Jeff Roberson <jeff@FreeBSD.org>

- Use vrefcnt() where it is safe to do so instead of doing direct and
unlocked accesses to v_usecount.
- Lock access to the buf lists in the various sync routines. interlock
locking could be avoided almost entirely in leaf filesystems if the
fsync function had a generic helper.


# 86ed6d45 18-Sep-2002 Nate Lawson <njl@FreeBSD.org>

Remove any VOP_PRINT that redundantly prints the tag.
Move lockmgr_printinfo() into vprint() for everyone's benefit.

Suggested by: bde


# 06be2aaa 14-Sep-2002 Nate Lawson <njl@FreeBSD.org>

Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by: phk
Reviewed by: bde, rwatson (earlier version)


# f7a91b49 05-Aug-2002 Pierre Beyssac <pb@FreeBSD.org>

Fix typo in vnode flags causing deadlock in msdosfs_fsync().

Reviewed by: jeff


# e6e370a7 04-Aug-2002 Jeff Roberson <jeff@FreeBSD.org>

- Replace v_flag with v_iflag and v_vflag
- v_vflag is protected by the vnode lock and is used when synchronization
with VOP calls is needed.
- v_iflag is protected by interlock and is used for dealing with vnode
management issues. These flags include X/O LOCK, FREE, DOOMED, etc.
- All accesses to v_iflag and v_vflag have either been locked or marked with
mp_fixme's.
- Many ASSERT_VOP_LOCKED calls have been added where the locking was not
clear.
- Many functions in vfs_subr.c were restructured to provide for stronger
locking.

Idea stolen from: BSD/OS


# d394511d 16-May-2002 Tom Rhodes <trhodes@FreeBSD.org>

More s/file system/filesystem/g


# 98b0c789 14-May-2002 Poul-Henning Kamp <phk@FreeBSD.org>

Make daddr_t and u_daddr_t 64bits wide.
Retire daddr64_t and use daddr_t instead.

Sponsored by: DARPA & NAI Labs.


# 32a95d83 05-Apr-2002 Bruce Evans <bde@FreeBSD.org>

Fixed a very old bug in setting timestamps using utimes(2) on msdosfs
files. We didn't clear the update marks when we set the times, so
some of the settings were sometimes clobbered with the current time a
little later. This caused cp -p even by root to almost always fail
to preserve any times despite not reporting any errors in attempting
to preserve them.

Don't forget to set the archive attribute when we set the read-only
attribute. We should only set the archive attribute if we actually
change something, but we mostly don't bother avoiding setting it
elsewhere, so don't bother here yet.

MFC after: 1 week


# 44731cab 01-Apr-2002 John Baldwin <jhb@FreeBSD.org>

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 11caded3 19-Mar-2002 Alfred Perlstein <alfred@FreeBSD.org>

Remove __P.


# 0d2af521 15-Mar-2002 Kirk McKusick <mckusick@FreeBSD.org>

Introduce the new 64-bit size disk block, daddr64_t. Change
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.


# 5a7f3ebb 28-Nov-2001 John Baldwin <jhb@FreeBSD.org>

Use suser_td() instead of explicitly checking cr_uid against 0.

PR: kern/21809
Submitted by: <mbendiks@eunet.no>
Reviewed by: rwatson


# b40ce416 12-Sep-2001 Julian Elischer <julian@FreeBSD.org>

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# ac01ecd9 19-Jul-2001 Assar Westerlund <assar@FreeBSD.org>

remove support for creating files and directories from msdosfs_mknod


# 1166fb51 25-May-2001 Ruslan Ermilov <ru@FreeBSD.org>

- sys/msdosfs moved to sys/fs/msdosfs
- msdos.ko renamed to msdosfs.ko
- /usr/include/msdosfs moved to /usr/include/fs/msdosfs


# a62615e5 01-May-2001 Poul-Henning Kamp <phk@FreeBSD.org>

Implement vop_std{get|put}pages() and add them to the default vop[].

Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but
the default.


# 60fb0ce3 28-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# d98dc34f 23-Apr-2001 Greg Lehey <grog@FreeBSD.org>

Correct #includes to work with fixed sys/mount.h.


# 19eb87d2 06-Mar-2001 John Baldwin <jhb@FreeBSD.org>

Grab the process lock while calling psignal and before calling psignal.


# 9ed346ba 08-Feb-2001 Bosko Milekic <bmilekic@FreeBSD.org>

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# b1b494a7 22-Oct-2000 Boris Popov <bp@FreeBSD.org>

Update stale comment.

PR: kern/21805


# e7b1ac75 22-Oct-2000 Boris Popov <bp@FreeBSD.org>

Remove de_lock field from denode structure and make msdosfs PDIRUNLOCK aware.


# a18b1f1d 03-Oct-2000 Jason Evans <jasone@FreeBSD.org>

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 012c643d 29-Aug-2000 Robert Watson <rwatson@FreeBSD.org>

o Restructure vaccess() so as to check for DAC permission to modify the
object before falling back on privilege. Make vaccess() accept an
additional optional argument, privused, to determine whether
privilege was required for vaccess() to return 0. Add commented
out capability checks for reference. Rename some variables to make
it more clear which modes/uids/etc are associated with the object,
and which with the access mode.
o Update file system use of vaccess() to pass NULL as the optional
privused argument. Once additional patches are applied, suser()
will no longer set ASU, so privused will permit passing of
privilege information up the stack to the caller.

Reviewed by: bde, green, phk, -security, others
Obtained from: TrustedBSD Project


# e39c53ed 20-Aug-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Centralize the canonical vop_access user/group/other check in vaccess().

Discussed with: bde


# 4ebb509c 14-Jul-2000 David Malone <dwmalone@FreeBSD.org>

Certain error contitions cause msdosfs_rename() to decrement the
vnode reference count on 'fdvp' more times than it should.

PR: 17347
Submitted by: Ian Dowse <iedowse@maths.tcd.ie>
Approved by: bde


# b1bd38b3 24-Jun-2000 Boris Popov <bp@FreeBSD.org>

Remove obsolete comment.

Submitted by: Marius Bendiksen <mbendiks@eunet.no>


# 9626b608 05-May-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# 2c9b67a8 30-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


# 8177437d 14-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Complete the bio/buf divorce for all code below devfs::strategy

Exceptions:
Vinum untouched. This means that it cannot be compiled.
Greg Lehey is on the case.

CCD not converted yet, casts to struct buf (still safe)

atapi-cd casts to struct buf to examine B_PHYS


# c244d2de 02-Apr-2000 Poul-Henning Kamp <phk@FreeBSD.org>

Move B_ERROR flag to b_ioflags and call it BIO_ERROR.

(Much of this done by script)

Move B_ORDERED flag to b_ioflags and call it BIO_ORDERED.

Move b_pblkno and b_iodone_chain to struct bio while we transition, they
will be obsoleted once bio structs chain/stack.

Add bio_queue field for struct bio aware disksort.

Address a lot of stylistic issues brought up by bde.


# 37713edc 08-Jan-2000 Boris Popov <bp@FreeBSD.org>

Treat negative uio_offset value as eof (idea by: bde).
Prevent overflows by casting uio_offset to uoff_t.
Return correct error number if directory entry is broken.

Reviewed by: bde


# 70852092 01-Jan-2000 Boris Popov <bp@FreeBSD.org>

Fix the mess with signed/unsigned longs and ints (inspired by bde).
Fix potential bug with directory reading.
Explicitly limit file size to 4GB (msdos can't handle larger files).
Slightly reorganize msdosfs_read() to reduce number of 'if's.


# 687fce03 28-Dec-1999 Boris Popov <bp@FreeBSD.org>

Avoid to write garbage if uiomove fails.


# dc22f85f 28-Dec-1999 Boris Popov <bp@FreeBSD.org>

Fix an overflow in the msdosfs_read() function which exposed on the files
with size > 2GB.

PR: 15639
Submitted by: Tim Kientzle <kientzle@acm.org>
Reviewed by: phk


# 762e6b85 15-Dec-1999 Eivind Eklund <eivind@FreeBSD.org>

Introduce NDFREE (and remove VOP_ABORTOP)


# 67ddfcaf 20-Sep-1999 Matthew Dillon <dillon@FreeBSD.org>

More removals of vnode->v_lastr, replaced by preexisting seqcount
heuristic to detect sequential operation.

VM-related forced clustering code removed from ufs in preparation for a
commit to vm/vm_fault.c that does it more generally.

Reviewed by: David Greenman <dg@root.com>, Alan Cox <alc@cs.rice.edu>


# c3aac50f 27-Aug-1999 Peter Wemm <peter@FreeBSD.org>

$Id$ -> $FreeBSD$


# 0ef1c826 08-Aug-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.


# a2a0b22c 24-Jul-1999 Bruce Evans <bde@FreeBSD.org>

Don't set DE_ACCESS for unsuccessful reads.
Translated from: a similar fix in ufs_readwrite.c rev.1.61.

Don't forget to set DE_ACCESS for short reads.

Check for invalid (negative) offsets before checking for reads of
0 bytes, as in ufs, although checking for invalid offsets at all
is probably a bug.


# 67812eac 25-Jun-1999 Kirk McKusick <mckusick@FreeBSD.org>

Convert buffer locking from using the B_BUSY and B_WANTED flags to using
lockmgr locks. This commit should be functionally equivalent to the old
semantics. That is, all buffer locking is done with LK_EXCLUSIVE
requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will
be done in future commits.


# bfbb9ce6 11-May-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Divorce "dev_t" from the "major|minor" bitmap, which is now called
udev_t in the kernel but still called dev_t in userland.

Provide functions to manipulate both types:
major() umajor()
minor() uminor()
makedev() umakedev()
dev2udev() udev2dev()

For now they're functions, they will become in-line functions
after one of the next two steps in this process.

Return major/minor/makedev to macro-hood for userland.

Register a name in cdevsw[] for the "filedescriptor" driver.

In the kernel the udev_t appears in places where we have the
major/minor number combination, (ie: a potential device: we
may not have the driver nor the device), like in inodes, vattr,
cdevsw registration and so on, whereas the dev_t appears where
we carry around a reference to a actual device.

In the future the cdevsw and the aliased-from vnode will be hung
directly from the dev_t, along with up to two softc pointers for
the device driver and a few houskeeping bits. This will essentially
replace the current "alias" check code (same buck, bigger bang).

A little stunt has been provided to try to catch places where the
wrong type is being used (dev_t vs udev_t), if you see something
not working, #undef DEVT_FASCIST in kern/kern_conf.c and see if
it makes a difference. If it does, please try to track it down
(many hands make light work) or at least try to reproduce it
as simply as possible, and describe how to do that.

Without DEVT_FASCIST I belive this patch is a no-op.

Stylistic/posixoid comments about the userland view of the <sys/*.h>
files welcome now, from userland they now contain the end result.

Next planned step: make all dev_t's refer to the same devsw[] which
means convert BLK's to CHR's at the perimeter of the vnodes and
other places where they enter the game (bootdev, mknod, sysctl).


# dfd5dee1 06-May-1999 Peter Wemm <peter@FreeBSD.org>

Add sufficient braces to keep egcs happy about potentially ambiguous
if/else nesting.


# 75c13541 28-Apr-1999 Poul-Henning Kamp <phk@FreeBSD.org>

This Implements the mumbled about "Jail" feature.

This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

I have no scripts for setting up a jail, don't ask me for them.

The IP number should be an alias on one of the interfaces.

mount a /proc in each jail, it will make ps more useable.

/proc/<pid>/status tells the hostname of the prison for
jailed processes.

Quotas are only sensible if you have a mountpoint per prison.

There are no privisions for stopping resource-hogging.

Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/


# f711d546 27-Apr-1999 Poul-Henning Kamp <phk@FreeBSD.org>

Suser() simplification:

1:
s/suser/suser_xxx/

2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.

3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/

The remaining suser_xxx() calls will be scrutinized and dealt with
later.

There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.

More changes to the suser() API will come along with the "jail" code.


# 831a80b0 27-Jan-1999 Matthew Dillon <dillon@FreeBSD.org>

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# f1d19042 07-Dec-1998 Archie Cobbs <archie@FreeBSD.org>

The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.


# 5c592627 29-Nov-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Honor MNT_NOATIME.

PR: 8383
Submitted by: Carl Mascott <cmascott@world.std.com>


# bad3d41d 20-Nov-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Support NT VFAT lower case flags.

PR: 8383
(Mostly) Submitted by: Carl Mascott <cmascott@world.std.com>


# 40c8cfe5 31-Oct-1998 Peter Wemm <peter@FreeBSD.org>

Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.


# e27b047c 13-Sep-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Fix a bug related to renaming in root directory. This bug reported by
Cejka Rudolf <cejkar@dcse.fee.vutbr.cz> on freebsd-current in Messaage-Id
<199807141023.MAA09803@kazi.dcse.fee.vutbr.cz>.

Reviewed by: bde


# ac1e407b 11-Jul-1998 Bruce Evans <bde@FreeBSD.org>

Fixed printf format errors.


# fd5d1124 04-Jul-1998 Julian Elischer <julian@FreeBSD.org>

VOP_STRATEGY grows an (struct vnode *) argument
as the value in b_vp is often not really what you want.
(and needs to be frobbed). more cleanups will follow this.
Reviewed by: Bruce Evans <bde@freebsd.org>


# 40b0939a 10-Jun-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Back out previous change. This behavior is at least completely
"susv2"-compliant.


# 75065d2c 10-Jun-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Also return EOPNOTSUPP rather than EINVAL for not supported owner and group
changes.


# c3a24f69 10-Jun-1998 Peter Wemm <peter@FreeBSD.org>

Return EOPNOTSUPP rather than EINVAL for flags that are not supported.


# ffbdd68e 09-Jun-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Fix typo in a comment.


# 058ef246 17-May-1998 Dmitrij Tejblum <dt@FreeBSD.org>

Remove bogus LK_RETRY.

Submitted by: bde


# 7be2d300 06-May-1998 Mike Smith <msmith@FreeBSD.org>

In the words of the submitter:

---------
Make callers of namei() responsible for releasing references or locks
instead of having the underlying filesystems do it. This eliminates
redundancy in all terminal filesystems and makes it possible for stacked
transport layers such as umapfs or nullfs to operate correctly.

Quality testing was done with testvn, and lat_fs from the lmbench suite.

Some NFS client testing courtesy of Patrik Kudo.

vop_mknod and vop_symlink still release the returned vpp. vop_rename
still releases 4 vnode arguments before it returns. These remaining cases
will be corrected in the next set of patches.
---------

Submitted by: Michael Hancock <michaelh@cet.co.jp>


# a0502b19 26-Mar-1998 Poul-Henning Kamp <phk@FreeBSD.org>

Add two new functions, get{micro|nano}time.

They are atomic, but return in essence what is in the "time" variable.
gettime() is now a macro front for getmicrotime().

Various patches to use the two new functions instead of the various
hacks used in their absence.

Some puntuation and grammer patches from Bruce.

A couple of XXX comments.


# a8e44116 19-Mar-1998 KATO Takenori <kato@FreeBSD.org>

Deleted 1024bytes/sector floppy code for PC-98 arch. The
1024bytes/sector code has not worked for long time and it should be
re-implemented.


# 651ae11e 06-Mar-1998 Mike Smith <msmith@FreeBSD.org>

Trivial filesystem getpages/putpages implementations, set the second.
These should be considered the first steps in a work-in-progress.
Submitted by: Terry Lambert <terry@freebsd.org>


# 3f75478c 02-Mar-1998 Mike Smith <msmith@FreeBSD.org>

Patch to the last commit; attempt to unspam stuff from NetBSD.
Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>


# ea0665cd 01-Mar-1998 Mike Smith <msmith@FreeBSD.org>

Fix mmap() on msdosfs. In the words of the submitter:

|In the process of evaluating the getpages/putpages issues I discovered
|that mmap on MSDOSFS does not work. This is because I blindly merged
|NetBSD changes in msdosfs_bmap and msdosfs_strategy. Apparently, their
|blocksize is always DEV_BSIZE (even in files), while in FreeBSD
|blocksize in files is v_mount->mnt_stat.f_iosize (i.e. clustersize in
|MSDOSFS case). The patch is below.

Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>


# 69a36f24 25-Feb-1998 Mike Smith <msmith@FreeBSD.org>

Fixes for some bugs in the VFAT/FAT32 support:

- 'mv longnamedfile1 longnamedfile2' would cause longnamedfile2 to lose its
long name.
- Long names have trailing spaces/dots stripped for lookup as well as
assignment.
- A lockup when the mdsosfs was accessed from within the Linux emulator is fixed.
- A bug whereby long filenames were recognised by Microsoft operating systems but
not FreeBSD is fixed.

Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>


# 5114f1d7 24-Feb-1998 Andrey A. Chernov <ache@FreeBSD.org>

Back out "always view in lowercase" part
Return to previous variant "comparing in lowercase" in winChkName


# 7391f611 23-Feb-1998 Andrey A. Chernov <ache@FreeBSD.org>

Implement loadable DOS<->local conversion tables for DOS names
Always create DOS name in uppercase
Always view DOS name in lowercase


# b998efa9 23-Feb-1998 Andrey A. Chernov <ache@FreeBSD.org>

Implement loadable upper->lower local conversion table


# a4f93748 22-Feb-1998 Andrey A. Chernov <ache@FreeBSD.org>

Reduce new arguments number added in my changes


# 13df76f2 22-Feb-1998 Andrey A. Chernov <ache@FreeBSD.org>

Implement loadable local<->unicode file names conversion
Note: it produce correct names only for Win95, DOS names are still
incorrect and need similar work
mount_msdos support coming soon


# 952a6212 18-Feb-1998 Jordan K. Hubbard <jkh@FreeBSD.org>

Update MSDOSFS code using NetBSD's msdosfs as a guide to support
FAT32 partitions. Unfortunately, we looked around here at
Walnut Creek CDROM for any newer FAT32-supporting versions
of Win95 and we were unsuccessful; only the older stuff here.
So this is untested beyond simply making sure it compiles and
someone with access to an actual FAT32 fs will have
to let us know how well it actually works.
Submitted by: Dmitrij Tejblum <dima@tejblum.dnttm.rssi.ru>
Obtained from: NetBSD


# 0b08f5f7 05-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Back out DIAGNOSTIC changes.


# 47cfdb16 04-Feb-1998 Eivind Eklund <eivind@FreeBSD.org>

Turn DIAGNOSTIC into a new-style option.


# 675ea6f0 26-Dec-1997 Bruce Evans <bde@FreeBSD.org>

Unspammed nested include of <vm/vm_zone.h>.


# ef91bd57 27-Oct-1997 Bruce Evans <bde@FreeBSD.org>

Removed unused #includes. The need for most of them went away with
recent changes (docluster* and vfs improvements).


# dba3870c 26-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

VFS interior redecoration.

Rename vn_default_error to vop_defaultop all over the place.
Move vn_bwrite from vfs_bio.c to vfs_default.c and call it vop_stdbwrite.
Use vop_null instead of nullop.
Move vop_nopoll from vfs_subr.c to vfs_default.c
Move vop_sharedlock from vfs_subr.c to vfs_default.c
Move vop_nolock from vfs_subr.c to vfs_default.c
Move vop_nounlock from vfs_subr.c to vfs_default.c
Move vop_noislocked from vfs_subr.c to vfs_default.c
Use vop_ebadf instead of *_ebadf.
Add vop_defaultop for getpages on master vnode in MFS.


# d54d34b5 16-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Make a set of VOP standard lock, unlock & islocked VOP operators, which
depend on the lock being located at vp->v_data. Saves 3x3 identical
vop procs, more as the other filesystems becomes lock aware.


# 987f5696 16-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Another VFS cleanup "kilo commit"

1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
intereface function, and now lives in the ufsmount structure.

2. Remove VOP_SEEK, it was unused.

3. Add mode default vops:

VOP_ADVLOCK vop_einval
VOP_CLOSE vop_null
VOP_FSYNC vop_null
VOP_IOCTL vop_enotty
VOP_MMAP vop_einval
VOP_OPEN vop_null
VOP_PATHCONF vop_einval
VOP_READLINK vop_einval
VOP_REALLOCBLKS vop_eopnotsupp

And remove identical functionality from filesystems

4. Add vop_stdpathconf, which returns the canonical stuff. Use
it in the filesystems. (XXX: It's probably wrong that specfs
and fifofs sets this vop, shouldn't it come from the "host"
filesystem, for instance ufs or cd9660 ?)

5. Try to make system wide VOP functions have vop_* names.

6. Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)


# cec0f20c 16-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

VFS mega cleanup commit (x/N)

1. Add new file "sys/kern/vfs_default.c" where default actions for
VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
POLL, REVOKE and STRATEGY. Various stuff spread over the entire
tree belongs here.

2. Change VOP_BLKATOFF to a normal function in cd9660.

3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These
are private interface functions between UFS and the underlying
storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now
live in struct ufsmount instead.

4. Remove a kludge of VOP_ functions in all filesystems, that did
nothing but obscure the simplicity and break the expandability.
If a filesystem doesn't implement VOP_FOO, it shouldn't have an
entry for it in its vnops table. The system will try to DTRT
if it is not implemented. There are still some cruft left, but
the bulk of it is done.

5. Fix another VCALL in vfs_cache.c (thanks Bruce!)


# 6a525123 15-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Hmm, realign the vnops into two columns.


# 539ef70c 15-Oct-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Stylistic overhaul of vnops tables.
1. Remove comment stating the blatantly obvious.
2. Align in two columns.
3. Sort all but the default element alphabetically.
4. Remove XXX comments pointing out entries not needed.


# 99448ed1 20-Sep-1997 John Dyson <dyson@FreeBSD.org>

Change the M_NAMEI allocations to use the zone allocator. This change
plus the previous changes to use the zone allocator decrease the useage
of malloc by half. The Zone allocator will be upgradeable to be able
to use per CPU-pools, and has more intelligent usage of SPLs. Additionally,
it has reasonable stats gathering capabilities, while making most calls
inline.


# a6aeade2 13-Sep-1997 Peter Wemm <peter@FreeBSD.org>

Convert select -> poll.
Delete 'always succeed' select/poll handlers, replaced with generic call.
Flag missing vnode op table entries.


# 0fa2443f 26-Aug-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Uncut&paste cache_lookup().

This unifies several times in theory indentical 50 lines of code.

The filesystems have a new method: vop_cachedlookup, which is the
meat of the lookup, and use vfs_cache_lookup() for their vop_lookup
method. vfs_cache_lookup() will check the namecache and pass on
to the vop_cachedlookup method in case of a miss.

It's still the task of the individual filesystems to populate the
namecache with cache_enter().

Filesystems that do not use the namecache will just provide the
vop_lookup method as usual.


# 8a40593f 17-May-1997 Poul-Henning Kamp <phk@FreeBSD.org>

Remove redundant check for vp == dvp (done in VFS before calling).


# 3bea2218 10-Apr-1997 Bruce Evans <bde@FreeBSD.org>

Get the declaration of `struct dirent' from <sys/dirent.h>, not from
<sys/dir.h>.

Removed unused #include.

Fixed type and order of struct members in pseudo-declaration of `struct
vop_readdir_args'.


# af3f60d5 26-Feb-1997 Bruce Evans <bde@FreeBSD.org>

Updated msdosfs to use Lite2 vfs configuration and Lite2 locking. It
should now work as (un)well as before the Lite2 merge.


# 6875d254 22-Feb-1997 Peter Wemm <peter@FreeBSD.org>

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 8092a8e1 12-Feb-1997 Mike Pritchard <mpp@FreeBSD.org>

Make this compile without warnings after the Lite2 merge:

- *fs_init routines now take a "struct vfsconf * vfsp" pointer
as an argument.
- Use the correct type for cookies.
- Update function prototypes.

Submitted by: bde


# 996c772f 09-Feb-1997 John Dyson <dyson@FreeBSD.org>

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 1130b656 14-Jan-1997 Jordan K. Hubbard <jkh@FreeBSD.org>

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# fdd71f98 25-Dec-1996 Bruce Evans <bde@FreeBSD.org>

Don't synchronously update the directory entry at the end of every
successful write. Only do it for the IO_SYNC case (like ufs). On
one of my systems, this speeds up `iozone 24 512' from 32K/sec
(1/128 as fast as ufs) to 2.8MB/sec (7/10 as fast as ufs).

Obtained from: partly from NetBSD


# 39b1a97c 01-Oct-1996 John Dyson <dyson@FreeBSD.org>

MSDOS FS used to allocate a buffer before extending the VM object. In
certain error conditions, it is possible for pages to be left allocated
in the object beyond it's end. It is generally bad practice to allocate
pages beyond the end of an object.


# 030e2e9e 19-Sep-1996 Nate Williams <nate@FreeBSD.org>

In sys/time.h, struct timespec is defined as:

/*
* Structure defined by POSIX.4 to be like a timeval.
*/
struct timespec {
time_t ts_sec; /* seconds */
long ts_nsec; /* and nanoseconds */
};

The correct names of the fields are tv_sec and tv_nsec.

Reminded by: James Drobina <jdrobina@infinet.com>


# b71fec07 03-Sep-1996 Bruce Evans <bde@FreeBSD.org>

Eliminated nested include of <sys/unistd.h> in <sys/file.h> in the kernel.
Include it directly in the few places where it is used.

Reduced some #includes of <sys/file.h> to #includes of <sys/fcntl.h> or
nothing.


# ad63a118 14-Jun-1996 Satoshi Asami <asami@FreeBSD.org>

The Great PC98 Merge.

All new code is "#ifdef PC98"ed so this should make no difference to
PC/AT (and its clones) users.

Ok'd by: core
Submitted by: FreeBSD(98) development team


# efeaf95a 06-Dec-1995 David Greenman <dg@FreeBSD.org>

Untangled the vm.h include file spaghetti.


# 58c27bcf 03-Dec-1995 Bruce Evans <bde@FreeBSD.org>

Added prototypes.


# ebbcda3e 13-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Fixed getdirentries() on nfs mounted msdosfs's. No cookies were returned
for certain common combinations of directory sizes, cluster sizes, and i/o
sizes (e.g., 4K, 4K, and 4K). The fix in rev. 1.21 was incomplete.

Reviewed by: dfr
Obtained from: party from NetBSD


# f57e6547 09-Nov-1995 Bruce Evans <bde@FreeBSD.org>

Introduced a type `vop_t' for vnode operation functions and used
it 1138 times (:-() in casts and a few more times in declarations.
This change is null for the i386.

The type has to be `typedef int vop_t(void *)' and not `typedef
int vop_t()' because `gcc -Wstrict-prototypes' warns about the
latter. Since vnode op functions are called with args of different
(struct pointer) types, neither of these function types is any use
for type checking of the arg, so it would be preferable not to use
the complete function type, especially since using the complete
type requires adding 1138 casts to avoid compiler warnings and
another 40+ casts to reverse the function pointer conversions before
calling the functions.


# a98ca469 29-Oct-1995 Poul-Henning Kamp <phk@FreeBSD.org>

Second batch of cleanup changes.
This time mostly making a lot of things static and some unused
variables here and there.


# d68a4190 22-Oct-1995 David Greenman <dg@FreeBSD.org>

Moved the filesystem read-only check out of the syscalls and into the
filesystem layer, as was done in lite-2. Merged in some other cosmetic
changes while I was at it. Rewrote most of msdosfs_access() to be more
like ufs_access() and to include the FS read-only check.

Obtained from: partially from 4.4BSD-lite2


# 821692d6 07-Oct-1995 Bruce Evans <bde@FreeBSD.org>

Return EINVAL instead of panicing for rename("dir1", "dir2/..").

Fixes part of PR 760.

This bug seems to be very old.


# c83ebe77 03-Sep-1995 John Dyson <dyson@FreeBSD.org>

Added VOP_GETPAGES/VOP_PUTPAGES and also the "backwards" block count
for VOP_BMAP. Updated affected filesystems...


# 5d81aaee 25-Aug-1995 Bruce Evans <bde@FreeBSD.org>

Fix bogus arg (&p instead of p) in the call to VOP_ACCESS() from
msdosfs_setattr(). The bug was benign because the arg isn't used.


# d2f25f26 02-Aug-1995 Doug Rabson <dfr@FreeBSD.org>

Make sure that a non-null cookie vector is returned even if there were no
valid entries in the block. Doing otherwise confuses the nfs server.


# 94a8606f 02-Aug-1995 Doug Rabson <dfr@FreeBSD.org>

Add support for the va_filerev attribute required by NFSv3.


# 47777413 01-Aug-1995 David Greenman <dg@FreeBSD.org>

Removed my special-case hack for VOP_LINK and fixed the problem with the
wrong vp's ops vector being used by changing the VOP_LINK's argument order.
The special-case hack doesn't go far enough and breaks the generic
bypass routine used in some non-leaf filesystems. Pointed out by Kirk
McKusick.


# 98796526 28-Jun-1995 David Greenman <dg@FreeBSD.org>

Fixed VOP_LINK argument order botch.


# d3628763 11-Jun-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Merge RELENG_2_0_5 into HEAD


# 9b2e5354 30-May-1995 Rodney W. Grimes <rgrimes@FreeBSD.org>

Remove trailing whitespace.


# d8762fa6 09-May-1995 Bruce Evans <bde@FreeBSD.org>

Submitted by: Mike Pritchard <pritc003@maroon.tc.umn.edu>

msdosfs_lookup() did no validation to see if the caller was validated
to delete/rename/create files. msdosfs_setattr() did no validation
to see if the caller was allowed to change the file permissions (turn
on/off the write bit) or update the file modification time (utimes).

The routines were fixed to validate the calls just like ufs does.


# d99adf96 11-Apr-1995 Andrey A. Chernov <ache@FreeBSD.org>

Fix link sys call
Submitted by: pritc003@maroon.tc.umn.edu


# edf8a815 19-Mar-1995 David Greenman <dg@FreeBSD.org>

Removed redundant newlines that were in some panic strings.


# 0d94caff 09-Jan-1995 David Greenman <dg@FreeBSD.org>

These changes embody the support of the fully coherent merged VM buffer cache,
much higher filesystem I/O performance, and much better paging performance. It
represents the culmination of over 6 months of R&D.

The majority of the merged VM/cache work is by John Dyson.

The following highlights the most significant changes. Additionally, there are
(mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to
support the new VM/buffer scheme.

vfs_bio.c:
Significant rewrite of most of vfs_bio to support the merged VM buffer cache
scheme. The scheme is almost fully compatible with the old filesystem
interface. Significant improvement in the number of opportunities for write
clustering.

vfs_cluster.c, vfs_subr.c
Upgrade and performance enhancements in vfs layer code to support merged
VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.

vm_object.c:
Yet more improvements in the collapse code. Elimination of some windows that
can cause list corruption.

vm_pageout.c:
Fixed it, it really works better now. Somehow in 2.0, some "enhancements"
broke the code. This code has been reworked from the ground-up.

vm_fault.c, vm_page.c, pmap.c, vm_object.c
Support for small-block filesystems with merged VM/buffer cache scheme.

pmap.c vm_map.c
Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of
kernel PTs.

vm_glue.c
Much simpler and more effective swapping code. No more gratuitous swapping.

proc.h
Fixed the problem that the p_lock flag was not being cleared on a fork.

swap_pager.c, vnode_pager.c
Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the
code doesn't need it anymore.

machdep.c
Changes to better support the parameter values for the merged VM/buffer cache
scheme.

machdep.c, kern_exec.c, vm_glue.c
Implemented a seperate submap for temporary exec string space and another one
to contain process upages. This eliminates all map fragmentation problems
that previously existed.

ffs_inode.c, ufs_inode.c, ufs_readwrite.c
Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on
busy buffers.

Submitted by: John Dyson and David Greenman


# a176f73d 26-Dec-1994 Bruce Evans <bde@FreeBSD.org>

Fix panic for `cp -p' by root to an msdos file system. Improve handling
of attributes so that `cp -p' to an msdos file system can succeed under
favourable circumstances (no uid or gid changes and no nonzero flags
except SF_ARCHIVED).

msdosfs_vnops.c:
The in-core inode flags were confused with the on-disk inode flags, so
chflags() clobbered the lock flag and caused a panic.

denode.h, msdosfs_denode.c, msdosfs_vnops.c:
Support the msdosfs archive attibute (ATTR_ARCHIVE) by mapping it to
the complement of the SF_ARCHIVED flag and setting the ATTR_ARCHIVE
bit when a file's modification time is set (but not when a file's
permissions are set; this is the standard wrong DOS behaviour).

denode.h, msdosfs_denode.c:
Remove the DE_UPDAT() macro. It was only used once, and the corresponding
macro in ufs has already been removed.

denode.h:
Don't change the timestamp for directories in DE_TIMES() (be consistent
with deupdat()).

msdosfs_vnops.c:
Handle chown() better: return EPERM instead of EINVAL if there are
insufficient permissions; otherwise, allow null changes.


# 63e4f22a 11-Dec-1994 Bruce Evans <bde@FreeBSD.org>

Fix numerous timestamp bugs.

DE_UPDATE was confused with DE_MODIFIED in some places (they do have
confusing names). Handle them exactly the same as IN_UPDATE and
IN_MODIFIED. This fixes chmod() and chown() clobbering the mtime
and other bugs.

DE_MODIFIED was set but not used.

Parenthesize macro args.

DE_TIMES() now takes a timeval arg instead of a timespec arg. It was
stupid to use a macro for speed and do unused conversions to prepare
for the macro.

Restore the left shifting of the DOS seconds count by 1. It got
lost among the shifts for the bitfields, so DOS seconds counts
appeared to range from 0 to 29 seconds (step 1) instead of 0 to 58
seconds (step 2).

Actually use the passed-in mtime in deupdat() as documented so that
utimes() works.

Change `extern __inline's to `static inline's so that msdosfs_fat.o
can be linked when it is compiled without -O.

Remove faking of directory mtimes to always be the current time. It's
more surprising for directory mtimes to change when you read the
directories than for them not to change when you write the directories.
This should be controlled by a mount-time option if at all.


# 5bddf43a 29-Nov-1994 Andrey A. Chernov <ache@FreeBSD.org>

Restore mv check, cause panic without it
Submitted by: Ade Barkah


# 6b4ec83c 01-Nov-1994 Jordan K. Hubbard <jkh@FreeBSD.org>

Fix from John Hay to avoid kernel panics when ap->a_eofflag is NULL.
I'm not sure if this is just masking another problem (like, should
ap->a_eofflag EVER be NULL?), but if it prevents a panic for now then
it may save an ALPHA customer.
Submitted by: jhay


# 6213f6fc 27-Oct-1994 Paul Traina <pst@FreeBSD.org>

Set the EOF flag properly.
Obtained from: netbsd-bugs mailing list


# a4c45b75 22-Oct-1994 Martin Renters <martin@FreeBSD.org>

Fixed panic when unmounting floppy msdos filesystems. Problem was
we weren't flushing dirty buffers. Fix stolen from ffs_fsync()


# 82478919 06-Oct-1994 David Greenman <dg@FreeBSD.org>

Use tsleep() rather than sleep so that 'ps' is more informative about
the wait.


# 9abf4d6e 28-Sep-1994 Doug Rabson <dfr@FreeBSD.org>

Make NFS ask the filesystems for directory cookies instead of making them
itself.


# c3c6d51e 27-Sep-1994 Poul-Henning Kamp <phk@FreeBSD.org>

Added declarations, fixed bugs due to missing decls. At least one of them
could panic a system. (I know, it paniced mine!).


# c901836c 20-Sep-1994 Garrett Wollman <wollman@FreeBSD.org>

Implemented loadable VFS modules, and made most existing filesystems
loadable. (NFS is a notable exception.)


# 27a0bc89 19-Sep-1994 Doug Rabson <dfr@FreeBSD.org>

Added msdosfs.

Obtained from: NetBSD