History log of /freebsd-11-stable/sys/fs/ext2fs/ext2_vfsops.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 350385 27-Jul-2019 fsu

MFC r349800,r349801,r349802:

Fix misc fs fuzzing issues.

Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert of Fraunhofer FKIE
Reported as: FS-27-EXT2-12: Denial of Service in openat-0 (vm_fault_hold/ext2_clusteracct)
FS-22-EXT2-9: Denial of service in ftruncate-0 (ext2_balloc)
FS-11-EXT2-6: Denial Of Service in write-1 (ext2_balloc)


# 346189 13-Apr-2019 pfg

MFC r344755 (by fsu@)
Fix integer overflow possibility.

Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert
of Fraunhofer FKIE
Reported as: FS-2-EXT2-1: Out-of-Bounds Write in nmount (ext2_vget)
Reviewed by: pfg
MFC after: 1 week

Differential Revision: https://reviews.freebsd.org/D19326


# 343517 28-Jan-2019 pfg

MFC r343459: (parcial)
ext2fs: Add some extra consistency checks for the superblock.

Maliciously formed, or badly corrupted, filesystems can cause kernel
panics. In general, such acts of foot-shooting can only be accomplished
by root, but in a world with VM images that is moving towards automated
mounts it is important to have some form of prevention.

Reported by: Christopher Krah, Thomas Barabosch, and Jan-Niclas Hilgert
of Fraunhofer FKIE.
Incidentaly this should also fix a memory corruption issue reported by
Dr Silvio Cesare of InfoSect.

Huge thanks to all reseachers for making us aware of the issue.

Note: for the MFC to stable/11 several changes had to made.

admbug: 872, 891
Reviewed by: fsu
Obtained from: NetBSD (with changes)


# 331722 29-Mar-2018 eadler

Revert r330897:

This was intended to be a non-functional change. It wasn't. The commit
message was thus wrong. In addition it broke arm, and merged crypto
related code.

Revert with prejudice.

This revert skips files touched in r316370 since that commit was since
MFCed. This revert also skips files that require $FreeBSD$ property
changes.

Thank you to those who helped me get out of this mess including but not
limited to gonzo, kevans, rgrimes.

Requested by: gjb (re)


# 330897 14-Mar-2018 eadler

Partial merge of the SPDX changes

These changes are incomplete but are making it difficult
to determine what other changes can/should be merged.

No objections from: pfg


# 322738 20-Aug-2017 pfg

MFC r320578:

ext2fs: be more verbose about unsupported ext2fs features.

It is useful to know exactly what features may be lacking when trying to
mount ext4 filesystems.

Submitted by: Fedor Uporov


# 322712 20-Aug-2017 pfg

MFC r320145:
ext2fs: Add uninit_bg feature support.

From the linux tune2fs(8) manpage:
"Allow the kernel to initialize bitmaps and inode tables and keep a high
watermark for the unused inodes in a filesystem, to reduce e2fsck(8) time.
This first e2fsck run after enabling this feature will take the full time,
but subsequent e2fsck runs will take only a fraction of the original time,
depending on how full the file system is."

Submitted by: Fedor Uporov


# 311231 04-Jan-2017 pfg

MFC r310705, r310706:
style(9) cleanups.

Just to reduce some of the issues found with indent(1).


# 309455 02-Dec-2016 pfg

MFC r309179:
ext2fs: avoid possible overflow when calculating malloc size.

This is inspired on r308064 for the case of mounting UFS.


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 301569 07-Jun-2016 pfg

ext2fs: Stop dropping and reacquiring Giant around geom calls.

As in UFS r300366.


# 301548 07-Jun-2016 pfg

ext2fs: cleanup generation number management.

Ext2/3/4 manages generation numbers differently than UFS so adopt
some rules that should work well. When allocating a new inode,
make sure we generate a "good" random value specifically avoiding
zero.

Don't interfere with the numbers that are already generated in
the filesystem: ext2fs doesn't have the backwards compatibility
issues where there were no generation numbers.

Reviewed by: kevlo
MFC after: 1 week


# 300423 22-May-2016 kevlo

arc4random() returns 0 to (2**32)−1, use an alternative to initialize
i_gen if it's zero rather than a divide by 2.

With inputs from delphij, mckusick, rmacklem

Reviewed by: mckusick


# 298609 25-Apr-2016 pfg

ext2fs: make use of the howmany() macro when available.

We have a howmany() macro in the <sys/param.h> header that is
convenient to re-use as it makes things easier to read.

MFC after: 2 weeks


# 297695 08-Apr-2016 kevlo

Fix comment.


# 294504 21-Jan-2016 pfg

ext2fs: Bring back the htree dir_index implementation.

The htree dir_index is perhaps one of the most characteristic
features of the linux ext3 implementation. It was removed
in r281670, due to repeated bug reports.

Damjan Jovanic detected and fixed three bugs and did some
stress testing by building Apache OpenOffice on top of it
so it is now in good shape to bring back.

Differential Revision: https://reviews.freebsd.org/D5007

Submitted by: Damjan Jovanovic
Reviewed by: pfg
Tested by: pho
Relnotes: Yes
MFC after: 2 months (only 10.x)


# 293683 11-Jan-2016 pfg

ext4: mount panic from freeing invalid pointers

Initialize the struct with those fields to zeroes on allocation,
preventing the panic.

Patch by: Damjan Jovanovic.

PR: 206056
MFC after: 3 days


# 281670 17-Apr-2015 pfg

Drop experimental dir_index support.

The htree directory index is a highly desirable feature for research
purposes and was meant to improve performance in our ext2/3 driver.
Unfortunately our implementation has two problems:

- It never really delivered any performance improvement.
- It appears to corrupt the filesystem in undetermined circumstances.

Strictly speaking dir_index is not required for read/write support in
ext2/3 and our limited ext4 support still works fine without it.

Regain stability in the ext2 driver by removing it. We may need it back
(fixed) if we want to support encrypted ext4 support but thanks to the
wonders of version control we can always revert this change and bring it
back.

PR: 191895
PR: 198731
PR: 199309

MFC after: 5 days


# 281562 15-Apr-2015 rmacklem

File systems that do not use the buffer cache (such as ZFS) must
use VOP_FSYNC() to perform the NFS server's Commit operation.
This patch adds a mnt_kern_flag called MNTK_USES_BCACHE which
is set by file systems that use the buffer cache. If this flag
is not set, the NFS server always does a VOP_FSYNC().
This should be ok for old file system modules that do not set
MNTK_USES_BCACHE, since calling VOP_FSYNC() is correct, although
it might not be optimal for file systems that use the buffer cache.

Reviewed by: kib
MFC after: 2 weeks


# 278802 15-Feb-2015 pfg

ext2fs: Plug small memory leak

free() e2fs_contigdirs upon error.
Undo zeroing of e2fs_gd as this was actually a false positive.

X-MFC with: 278790


# 278790 14-Feb-2015 pfg

Initialize the allocation of variables related to the ext2 allocator.

The e2fs_gd struct was not being initialized and garbage was
being used for hinting the ext2 allocator variant.
Use malloc to clear the values and also initialize e2fs_contigdirs
during allocation to keep consistency.

While here clean up small style issues.

Reported by: Clang static analyser
MFC after: 1 week


# 277354 19-Jan-2015 pfg

ext2: Garbage-collect some unused variables

Reported by: clang static analysis
MFC after: 2 weeks


# 274437 12-Nov-2014 pfg

ifdef ext2_print_inode which is not really used.

ext2_print_inode is not really used but it was nice to
have for initial development work. #ifdef it under a
new EXT2FS_DEBUG knob so that we don't spend time
compiling it.

MFC after: 3 days


# 262623 28-Feb-2014 pfg

ext2fs: use of tab vs spaces.

Consistently use a single tab after a #define as mentioned in style(9).
Use tabs instead of space for indenting.
Fix a typo: "hash_vesion".

No functional change.

MFC after: 3 days


# 262346 22-Feb-2014 pfg

ext2fs: fully enable ext4 read-only support.

The ext4 developers tend to tag Ext4-specific flags as
"incompatible" even when such features are not relevant for
read-only support. This is a consequence of the process
though which this filesystem is implemented without design
and the fact that some new features are not extensible to
ext2/3.

Organize the features according to what we support and sort
them so that we can now read-only mount filesystems with
some features that may be found in newly formatted ext4 fs.

Submitted by: Zheng Liu
Reviewed by: pfg
MFC after: 5 days


# 261235 28-Jan-2014 pfg

ext2fs: Use i_flag instead of i_flags for Ext4 inode flags.

The ext4 inode flags do not have equivalents for chflags (1)
and hold information that is private to the implementation.
The i_flag field in the inode is a better place to hold the Ext4
inode flags as it saves us from masking flags while setting or
getting attributes. It should also make things cleaner if we
implement write support for Ext4.

Suggested by: bde
Tested by: Mike Ma
MFC after: 3 days


# 260988 21-Jan-2014 pfg

ext2fs: Translate the EXT4_EXTENTS and EXT4_INDEX to the inode flags.

r260545 cleared the inode flags to fix corruption problems but
we still need to pass some EXT4 flags for the ext4 read-only
mode. None of these attributes has an equivalent in FreeBSD and
are uninteresting for the system utilities so they should be
innaccessible in ext2_getattrib().

Note: we also use EXT4_HUGE_FILE but we use it directly from the
dinode structure so it is not necessary to translate it,

Suggested by: bde
MFC after: 3 days


# 254260 12-Aug-2013 pfg

Add read-only support for extents in ext2fs.

Basic support for extents was implemented by Zheng Liu as part
of his Google Summer of Code in 2010. This support is read-only
at this time.

In addition to extents we also support the huge_file extension
for read-only purposes. This works nicely with the additional
support for birthtime/nanosec timestamps and dir_index that
have been added lately.

The implementation may not work for all ext4 filesystems as
it doesn't support some features that are being enabled by
default on recent linux like flex_bg. Nevertheless, the feature
should be very useful for migration or simple access in
filesystems that have been converted from ext2/3 or don't use
incompatible features.

Special thanks to Zheng Liu for his dedication and continued
work to support ext2 in FreeBSD.

Submitted by: Zheng Liu (lz@)
Reviewed by: Mike Ma, Christoph Mallon (previous version)
Sponsored by: Google Inc.
MFC after: 3 weeks


# 252890 06-Jul-2013 pfg

Initial implementation of the HTree directory index.

This is a port of NetBSD's GSoC 2012 Ext3 HTree directory indexing
by Vyacheslav Matyushin. It was cleaned up and enhanced for FreeBSD
by Zheng Liu (lz@).

This is an excellent example of work shared among different projects:
Vyacheslav was able to look at an early prototype from Zheng Liu who
was also able to check the code from Haiku (with permission).

As in linux, the feature is not available by default and must be
enabled explicitly with tune2fs. We still do not support the
workarounds required in readdir for NFS.

Submitted by: Zheng Liu
Tested by: Mike Ma
Sponsored by: Google Inc.
MFC after: 1 week


# 252397 29-Jun-2013 pfg

ext2fs: Use the complete random() range in i_gen.

i_gen is unsigned in ext2fs so we can handle the complete
32 bits.

MFC after: 1 week


# 251612 11-Jun-2013 pfg

s/file system/filesystem/g

Based on r96755 from UFS.

MFC after: 3 days


# 246564 08-Feb-2013 pfg

ext2fs: Replace redundant EXT2_MIN_BLOCK with EXT2_MIN_BLOCK_SIZE.

Submitted by: Christoph Mallon
MFC after: 2 weeks


# 246563 08-Feb-2013 pfg

ext2fs: make e2fs_maxcontig local and remove tautological check.

e2fs_maxcontig was modelled after UFS when bringing the
"Orlov allocator" to ext2. On UFS fs_maxcontig is kept in the
superblock and is used by userland tools (fsck and growfs),

In ext2 this information is volatile so it is not available
for userland tools, so in this case it doesn't have sense
to carry it in the in-memory superblock.

Also remove a pointless check for MAX(1, x) > 0.

Submitted by: Christoph Mallon
MFC after: 2 weeks


# 246258 02-Feb-2013 pfg

ext2fs: general cleanup.

- Remove unused extern declarations in fs.h
- Correct comments in ext2_dir.h
- Several panic() messages showed wrong function names.
- Remove commented out stray line in ext2_alloc.c.
- Remove the unused macro EXT2_BLOCK_SIZE_BITS() and the then
write-only member e2fs_blocksize_bits from struct m_ext2fs.
- Remove the unused macro EXT2_FIRST_INO() and the then write-only
member e2fs_first_inode from struct m_ext2fs.
- Remove EXT2_DESC_PER_BLOCK() and the member e2fs_descpb from
struct m_ext2fs.
- Remove the unused members e2fs_bmask, e2fs_dbpg and
e2fs_mount_opt from struct m_ext2fs
- Correct harmless off-by-one error for fspath in ext2_vfsops.c.
- Remove the unused and broken macros EXT2_ADDR_PER_BLOCK_BITS()
and EXT2_DESC_PER_BLOCK_BITS().
- Remove the !_KERNEL versions of the EXT2_* macros.

Submitted by: Christoph Mallon
MFC after: 2 weeks


# 244475 20-Dec-2012 pfg

More constant renaming in preparation for newer features.

We also try to make better use of the fs flags instead of
trying adapt the code according to the fs structures. In
the case of subsecond timestamps and birthtime we now
check that the feature is explicitly enabled: previously
we only checked that the reserved space was available and
silently wrote them.

This approach is much safer, especially if the filesystem
happens to use embedded inodes or support EAs.

Discussed with: Zheng Liu
MFC after: 5 days


# 243311 19-Nov-2012 attilio

r16312 is not any longer real since many years (likely since when VFS
received granular locking) but the comment present in UFS has been
copied all over other filesystems code incorrectly for several times.

Removes comments that makes no sense now.

Reviewed by: kib
MFC after: 3 days


# 242833 09-Nov-2012 attilio

Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag.
Porters should refer to __FreeBSD_version 1000021 for this change as
it may have happened at the same timeframe.


# 238697 22-Jul-2012 kevlo

Use NULL instead of 0 for pointers


# 238059 03-Jul-2012 kevlo

Fix a typo


# 235241 10-May-2012 pluknet

Fix mount interlock oversights from the previous change in r234386.

Reported by: dougb
Submitted by: Mateusz Guzik <mjguzik at gmail com>
Reviewed by: Kirk McKusick
Tested by: pho


# 234386 17-Apr-2012 mckusick

Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).

To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.

The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks


# 232703 08-Mar-2012 pfg

Add support for ns timestamps and birthtime to the ext2/3 driver.

When using big inodes there is sufficient space in ext3 to
keep extra resolution and birthtime (creation) timestamps.
The appropriate fields in the on-disk inode have been approved
for a long time but support for this in ext3 has not been
widely distributed.

In preparation for ext4 most linux distributions have enabled
by default such bigger inodes and some people use nanosecond
timestamps in ext3. We now support those when the inode is big
enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE,
we maintain the extra timestamps even when they are not used.

An additional note by Bruce Evans:
We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but
all file systems have always done that. When POSIX gets around
to specifying the behaviour, it will probably require certain
rounding to the fs's resolution and not rejecting the request.
This unfortunately means that syscalls that set times can't
really tell if they succeeded without reading back the times
using stat() or similar and checking that they were set close
enough.

Reviewed by: bde
Approved by: jhb (mentor)
MFC after: 2 weeks


# 228583 16-Dec-2011 pfg

Style cleanups by jh@.
Fix a comment from the previous commit.
Use M_ZERO instead of bzero() in ext2_vfsops.c
Add include guards from PR.

PR: 162564
Approved by: jhb (mentor)
MFC after: 2 weeks


# 228539 15-Dec-2011 pfg

Bring in reallocblk to ext2fs.

The feature has been standard for a while in UFS as a means to reduce
fragmentation, therefore maintaining consistent performance with
filesystem aging. This is also very similar to what ext4 calls
"delayed allocation".

In his 2010 GSoC, Zheng Liu ported and benchmarked the missing
FANCY_REALLOC code to find more consistent performance improvements than
with the preallocation approach.

PR: 159233
Author: Zheng Liu <gnehzuil AT SPAMFREE gmail DOT com>
Sponsored by: Google Inc.
Approved by: jhb (mentor)
MFC after: 2 weeks


# 222167 21-May-2011 rmacklem

Add a lock flags argument to the VFS_FHTOVP() file system
method, so that callers can indicate the minimum vnode
locking requirement. This will allow some file systems to choose
to return a LK_SHARED locked vnode when LK_SHARED is specified
for the flags argument. This patch only adds the flag. It
does not change any file system to use it and all callers
specify LK_EXCLUSIVE, so file system semantics are not changed.

Reviewed by: kib


# 221128 27-Apr-2011 jhb

Use a private EXT2_ROOTINO constant instead of redefining ROOTINO.

Submitted by: Pedro F. Giffuni giffunip at yahoo


# 218176 01-Feb-2011 jhb

Some cosmetic fixes and remove a duplicate constant.

Submitted by: Pedro F. Giffuni giffunip at yahoo


# 217702 21-Jan-2011 jhb

Restore support for the 'async' and 'sync' mount options lost when
switching to nmount(2). While here, sort the options.

PR: kern/153584
Submitted by: Pedro F. Giffuni giffunip at yahoo
MFC after: 1 week


# 217582 19-Jan-2011 jhb

Merge 118969 from UFS:
Eliminate the i_devvp field from the incore inodes, we can get the same
value from ip->i_ump->um_devvp.

Submitted by: Pedro F. Giffuni giffunip at yahoo
MFC after: 1 week


# 202283 14-Jan-2010 lulf

Bring in the ext2fs work done by Aditya Sarawgi during and after Google Summer
of Code 2009:

- BSDL block and inode allocation policies for ext2fs. This involves the use
FFS1 style block and inode allocation for ext2fs. Preallocation was removed
since it was GPL'd.
- Make ext2fs MPSAFE by introducing locks to per-mount datastructures.
- Fixes for kern/122047 PR.
- Various small bugfixes.
- Move out of gnu/ directory.

Sponsored by: Google Inc.
Submitted by: Aditya Sarawgi <sarawgi.aditya AT SPAMFREE gmail DOT com>


# 193628 07-Jun-2009 stas

- Outindent long printf lines instead of splitting them in the
middle of senetences. This also makes the code more consistent
with the corresponding FFS code.
- Use 2-space sentences breaks consistently.

Suggested by: bde


# 193382 03-Jun-2009 stas

- Style(9) improvements.
- Convert all K&R definitions to ANSI equialents.
- Retire bsd_malloc and bsd_free macros and
use malloc/free directly.
- Drop some unused debugging calls.

This commit brings no functional changes.


# 193377 03-Jun-2009 stas

- Sync our copies of ext2fs Linux headers to current Linux versions.
Minimize differencies between our ext2fs headers and relevant Linux
versions by using EXT2_SB macro to access the superblock fields. Most
of the differencies in access to these fields are now hidden inside
this macro.
- Rename the s_db_per_group field of ext2fs_sb_info to s_gdb_count
to reflect the similar change in Linux headers. New name also seem
to be more appropriate for this field.
- Use proper types for s_first_inode and s_inode_size in-core superblock
fields. Now they reflec types used in the on-disk superblock version.
- Add support for older filesystem revisions that doesn't have proper
s_first_ino and s_inode_size fields in the on-disk superblock. In these
cases predefined values for these fields are used.
- Add simple sanity checks for s_first_inode and s_inode_size correctness.

Reviewed by: bde (previous version)
MFC after: 2 weeks


# 191990 11-May-2009 attilio

Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.


# 187397 18-Jan-2009 stas

- Eliminate warnings in debug print macros by explicitly converting all
field to unsigned long.


# 187396 18-Jan-2009 stas

- Whitespace fixes.
- s_bmask field doesn't exist.
- Use correct flags in debug printf.


# 187395 18-Jan-2009 stas

- Obtain inode sizes and location of the first inode based on the contents
of superblock rather than using hardcoded values. This fixes ext2fs on
filesystems with inode sized other than 128.

Submitted by: Alex Lyashkov <Alexey.Lyashkov@Sun.COM> (based on)
MFC after: 2 weeks


# 184554 02-Nov-2008 attilio

Improve VFS locking:
- Implement real draining for vfs consumers by not relying on the
mnt_lock and using instead a refcount in order to keep track of lock
requesters.
- Due to the change above, remove the mnt_lock lockmgr because it is now
useless.
- Due to the change above, vfs_busy() is no more linked to a lockmgr.
Change so its KPI by removing the interlock argument and defining 2 new
flags for it: MBF_NOWAIT which basically replaces the LK_NOWAIT of the
old version (which was unlinked from the lockmgr alredy) and
MBF_MNTLSTLOCK which provides the ability to drop the mountlist_mtx
once the mnt interlock is held (ability still desired by most consumers).
- The stub used into vfs_mount_destroy(), that allows to override the
mnt_ref if running for more than 3 seconds, make it totally useless.
Remove it as it was thought to work into older versions.
If a problem of "refcount held never going away" should appear, we will
need to fix properly instead than trust on such hackish solution.
- Fix a bug where returning (with an error) from dounmount() was still
leaving the MNTK_MWAIT flag on even if it the waiters were actually
woken up. Just a place in vfs_mount_destroy() is left because it is
going to recycle the structure in any case, so it doesn't matter.
- Remove the markercnt refcount as it is useless.

This patch modifies VFS ABI and breaks KPI for vfs_busy() so manpages and
__FreeBSD_version will be modified accordingly.

Discussed with: kib
Tested by: pho


# 184413 28-Oct-2008 trasz

Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by: rwatson (mentor)


# 184205 23-Oct-2008 des

Retire the MALLOC and FREE macros. They are an abomination unto style(9).

MFC after: 3 months


# 183754 10-Oct-2008 attilio

Remove the struct thread unuseful argument from bufobj interface.
In particular following functions KPI results modified:
- bufobj_invalbuf()
- bufsync()

and BO_SYNC() "virtual method" of the buffer objects set.
Main consumers of bufobj functions are affected by this change too and,
in particular, functions which changed their KPI are:
- vinvalbuf()
- g_vfs_close()

Due to the KPI breakage, __FreeBSD_version will be bumped in a later
commit.

As a side note, please consider just temporary the 'curthread' argument
passing to VOP_SYNC() (in bufsync()) as it will be axed out ASAP

Reviewed by: kib
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>


# 182542 31-Aug-2008 attilio

Decontextualize vfs_busy(), vfs_unbusy() and vfs_mount_alloc() functions.

Manpages are updated accordingly.

Tested by: Diego Sardina <siarodx at gmail dot com>


# 177645 26-Mar-2008 jhb

Fix a nit with the 'nofoo' options where 'foo' is mapped to 'nonofoo'
(such as 'atime' vs 'noatime'). The filesystems will always see either
'nofoo' or 'nonofoo', never plain 'foo'. As such, their list of valid
mount options should include 'nofoo' instead of 'foo'. With this fix,
you can do 'mount -u -o atime' on a FFS filesystem that isn't marked as
noatime without getting an error. You can also update a noatime FFS
filesystem mounted via mount(2) (e.g. 6.x /sbin/mount binary) to 'atime'
using nmount(2) (e.g. 7.x /sbin/mount binary).

MFC after: 1 week
Reviewed by: crodig


# 175635 24-Jan-2008 attilio

Cleanup lockmgr interface and exported KPI:
- Remove the "thread" argument from the lockmgr() function as it is
always curthread now
- Axe lockcount() function as it is no longer used
- Axe LOCKMGR_ASSERT() as it is bogus really and no currently used.
Hopefully this will be soonly replaced by something suitable for it.
- Remove the prototype for dumplockinfo() as the function is no longer
present

Addictionally:
- Introduce a KASSERT() in lockstatus() in order to let it accept only
curthread or NULL as they should only be passed
- Do a little bit of style(9) cleanup on lockmgr.h

KPI results heavilly broken by this change, so manpages and
FreeBSD_version will be modified accordingly by further commits.

Tested by: matteo


# 175294 13-Jan-2008 attilio

VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>


# 175202 09-Jan-2008 attilio

vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>


# 173066 27-Oct-2007 rodrigc

Remove duplicate "union" from ext2_opts.

Noticed by: bde


# 172697 16-Oct-2007 alfred

Get rid of qaddr_t.

Requested by: bde


# 171852 15-Aug-2007 jhb

On 6.x this works:

% mount | grep home
/dev/ad4s1e on /home (ufs, local, noatime, soft-updates)
% mount -u -o atime /home
% mount | grep home
/dev/ad4s1e on /home (ufs, local, soft-updates)

Restore this behavior for on 7.x for the following mount options:
noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow

In addition, on 7.x, the following are equivalent:
mount -u -o atime /home
mount -u -o nonoatime /home

Ideally, when we introduce new mount options, we should avoid
options starting with "no". :)

Requested by: jhb
Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com>
Approved by: re (bmah)
Proxy commit for: rodrigc


# 171450 14-Jul-2007 rodrigc

The last entry in the ext2_opts array must be NULL,
otherwise the kernel with crash in vfs_filteropt() if an invalid
mount option is passed to ext2fs.

Approved by: re (kensmith)


# 167580 14-Mar-2007 rodrigc

Add "force" to ext2_ops, to match what was in the old mount_ext2fs binary.

Reported by: Ivan Voras <ivoras fer hr>


# 167497 12-Mar-2007 tegge

Make insmntque() externally visibile and allow it to fail (e.g. during
late stages of unmount). On failure, the vnode is recycled.

Add insmntque1(), to allow for file system specific cleanup when
recycling vnode on failure.

Change getnewvnode() to no longer call insmntque(). Previously,
embryonic vnodes were put onto the list of vnode belonging to a file
system, which is unsafe for a file system marked MPSAFE.

Change vfs_hash_insert() to no longer lock the vnode. The caller now
has that responsibility.

Change most file systems to lock the vnode and call insmntque() or
insmntque1() after a new vnode has been sufficiently setup. Handle
failed insmntque*() calls by propagating errors to callers, possibly
after some file system specific cleanup.

Approved by: re (kensmith)
Reviewed by: kib
In collaboration with: kib


# 166774 15-Feb-2007 pjd

Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method.
This way we may support multiple structures in v_data vnode field within
one file system without using black magic.

Vnode-to-file-handle should be VOP in the first place, but was made VFS
operation to keep interface as compatible as possible with SUN's VFS.
BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.

VFS_VPTOFH() was left for API backward compatibility, but is marked for
removal before 8.0-RELEASE.

Approved by: mckusick
Discussed with: many (on IRC)
Tested with: ufs, msdosfs, cd9660, nullfs and zfs


# 164385 18-Nov-2006 rodrigc

Previously, the mount_ext2fs binary listed the acceptable mount
options for ext2fs. Now that we use nmount() directly from the mount
binary to access ext2fs filesystems, add the list of acceptable mount
options to ext2_ops, so that vfs_filteropts() will accept
options like "noatime" for ext2fs.

PR: 105483
Noticed by: Dr. Markus Waldeck <waldeck gmx de>
MFC after: 1 month


# 164033 06-Nov-2006 rwatson

Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges. These may
require some future tweaking.

Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>


# 162647 26-Sep-2006 tegge

Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag.
This eliminates a race where MNT_UPDATE flag could be lost when nmount()
raced against sync(), sync_fsync() or quotactl().


# 158924 25-May-2006 rodrigc

Remove calls to vfs_export() for exporting a filesystem for NFS mounting
from individual filesystems. Call it instead in vfs_mount.c,
after we call VFS_MOUNT() for a specific filesystem.


# 154152 09-Jan-2006 tegge

Add marker vnodes to ensure that all vnodes associated with the mount point are
iterated over when using MNT_VNODE_FOREACH.

Reviewed by: truckman


# 151897 31-Oct-2005 rwatson

Normalize a significant number of kernel malloc type names:

- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.

- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.

- Disambiguate some collisions by adding subsystem prefixes to some
memory types.

- Generally prefer lower case to upper case.

- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.

Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.


# 149960 10-Sep-2005 rodrigc

In ext2_mountfs(), check that the superblock size, SBSIZE,
is aligned with the sectorsize value returned by GEOM, before
doing a bread() of the superblock.
This eliminates a panic when trying the following on an empty CD-ROM drive:
mount_ext2fs /dev/acd0 /mnt

Reviewed by: phk


# 149771 03-Sep-2005 ssouhlal

Unbreak hpfs/ntfs/udf/ext2fs/reiserfs mounting.

Another pointyhat to: ssouhlal


# 149720 02-Sep-2005 ssouhlal

*_mountfs() (if the filesystem mounts from a device) needs devvp to be
locked, so lock it.

Glanced at by: phk
MFC after: 3 days


# 147408 16-Jun-2005 imp

Add standard GPL boilerplate to these files. They are the only ones
contaminated with the GPL code. While this information was present in
the COPYRIGHT.INFO file, it is FreeBSD's standard practice to, where
possible, include explicit license information in files.

Approved by: release engineer (scottl)


# 147393 15-Jun-2005 rodrigc

Move ext2fs from src/gnu to src/gnu/fs.
Discussed on arch@.

Reviewed by: kan
Approved by: re (blanket), kan


# 144059 24-Mar-2005 jeff

- Update vfs_root implementations to match the new prototype. None of
these filesystems will support shared locks until they are explicitly
modified to do so. Careful review must be done to ensure that this
is safe for each individual filesystem.

Sponsored by: Isilon Systems, Inc.


# 143692 16-Mar-2005 phk

Add two arguments to the vfs_hash() KPI so that filesystems which do
not have unique hashes (NFS) can also use it.


# 143677 16-Mar-2005 phk

Don't hold a reference to the disk vnode for each inode.

Don't store the disk cdev in all inodes, it's only used for debugging
printfs.


# 143663 15-Mar-2005 phk

Improve the vfs_hash() API: vput() the unneeded vnode centrally to
avoid replicating the vput in all the filesystems.


# 143619 15-Mar-2005 phk

Simplify the vfs_hash calling convention.


# 143578 14-Mar-2005 phk

Use vfs_hash() instead of home-rolled


# 143509 13-Mar-2005 jeff

- Catch up with ufs_inode 1.59, ffs_vfsops.c 1.280, and ufs_vnops.c 1.267.
Various changes to support new vgone() locking protocol.

Sponsored by: Isilon Systems, Inc.


# 140822 25-Jan-2005 phk

Introduce and use g_vfs_close().


# 140768 24-Jan-2005 phk

Create a vp->v_object in VFS_FHTOVP() if we want to be exportable
with NFS.

We are moving responsibility for creating the vnode_pager object into
the filesystems which own the vnode, and this is one of the places
we have to cover.

We call vnode_create_vobject() directly because we own the vnode.

If we can get the size easily, pass it as an argument to save the
call to VOP_GETATTR() in vnode_create_vobject()


# 140736 24-Jan-2005 phk

Remove unused cred argument to ext2_reload()


# 140220 14-Jan-2005 phk

Eliminate unused and unnecessary "cred" argument from vinvalbuf()


# 140048 11-Jan-2005 phk

Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC().

I'm not sure why a credential was added to these in the first place, it is
not used anywhere and it doesn't make much sense:

The credentials for syncing a file (ability to write to the
file) should be checked at the system call level.

Credentials for syncing one or more filesystems ("none")
should be checked at the system call level as well.

If the filesystem implementation needs a particular credential
to carry out the syncing it would logically have to the
cached mount credential, or a credential cached along with
any delayed write data.

Discussed with: rwatson


# 139778 06-Jan-2005 imp

/* -> /*- for copyright notices, minor format tweaks as necessary


# 138493 06-Dec-2004 phk

Convert to nmount. Add omount compat code.


# 138412 05-Dec-2004 phk

VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases
doesn't. Most of the implementations have grown weeds for this so they
copy some fields from mnt_stat if the passed argument isn't that.

Fix this the cleaner way: Always call the implementation on mnt_stat
and copy that in toto to the VFS_STATFS argument if different.


# 138368 04-Dec-2004 phk

Remove #if 0'ed rootfs mounting code.


# 138290 01-Dec-2004 phk

Back when VOP_* was introduced, we did not have new-style struct
initializations but we did have lofty goals and big ideals.

Adjust to more contemporary circumstances and gain type checking.

Replace the entire vop_t frobbing thing with properly typed
structures. The only casualty is that we can not add a new
VOP_ method with a loadable module. History has not given
us reason to belive this would ever be feasible in the the
first place.

Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.

Give coda correct prototypes and function definitions for
all vop_()s.

Generate a bit more data from the vnode_if.src file: a
struct vop_vector and protype typedefs for all vop methods.

Add a new vop_bypass() and make vop_default be a pointer
to another struct vop_vector.

Remove a lot of vfs_init since vop_vector is ready to use
from the compiler.

Cast various vop_mumble() to void * with uppercase name,
for instance VOP_PANIC, VOP_NULL etc.

Implement VCALL() by making vdesc_offset the offsetof() the
relevant function pointer in vop_vector. This is disgusting
but since the code is generated by a script comparatively
safe. The alternative for nullfs etc. would be much worse.

Fix up all vnode method vectors to remove casts so they
become typesafe. (The bulk of this is generated by scripts)


# 137321 06-Nov-2004 phk

Get even closer to not crashing ext2fs


# 137320 06-Nov-2004 phk

Get closer to unbreaking ext2fs


# 137039 29-Oct-2004 phk

Move EXT2FS to GEOM backing instead of DEVFS.

For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.


# 137008 28-Oct-2004 phk

Reduce the locking activity by epsilon by checking VNON condition before
releasing the mountlock.


# 136943 25-Oct-2004 phk

Loose the v_dirty* and v_clean* alias macros.

Check the count field where we just want to know the full/empty state,
rather than using TAILQ_EMPTY() or TAILQ_FIRST().


# 135864 27-Sep-2004 phk

Desupport device nodes on EXT2 filesystems.


# 132902 30-Jul-2004 phk

Put a version element in the VFS filesystem configuration structure
and refuse initializing filesystems with a wrong version. This will
aid maintenance activites on the 5-stable branch.

s/vfs_mount/vfs_omount/

s/vfs_nmount/vfs_mount/

Name our filesystems mount function consistently.

Eliminate the namiedata argument to both vfs_mount and vfs_omount.
It was originally there to save stack space. A few places abused
it to get hold of some credentials to pass around. Effectively
it is unused.

Reorganize the root filesystem selection code.


# 132805 28-Jul-2004 phk

Remove global variable rootdevs and rootvp, they are unused as such.

Add local rootvp variables as needed.

Remove checks for miniroot's in the swappartition. We never did that
and most of the filesystems could never be used for that, but it had
still been copy&pasted all over the place.


# 132023 12-Jul-2004 alfred

Make VFS_ROOT() and vflush() take a thread argument.
This is to allow filesystems to decide based on the passed thread
which vnode to return.
Several filesystems used curthread, they now use the passed thread.


# 131551 04-Jul-2004 phk

When we traverse the vnodes on a mountpoint we need to look out for
our cached 'next vnode' being removed from this mountpoint. If we
find that it was recycled, we restart our traversal from the start
of the list.

Code to do that is in all local disk filesystems (and a few other
places) and looks roughly like this:

MNT_ILOCK(mp);
loop:
for (vp = TAILQ_FIRST(&mp...);
(vp = nvp) != NULL;
nvp = TAILQ_NEXT(vp,...)) {
if (vp->v_mount != mp)
goto loop;
MNT_IUNLOCK(mp);
...
MNT_ILOCK(mp);
}
MNT_IUNLOCK(mp);

The code which takes vnodes off a mountpoint looks like this:

MNT_ILOCK(vp->v_mount);
...
TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes);
...
MNT_IUNLOCK(vp->v_mount);
...
vp->v_mount = something;

(Take a moment and try to spot the locking error before you read on.)

On a SMP system, one CPU could have removed nvp from our mountlist
but not yet gotten to assign a new value to vp->v_mount while another
CPU simultaneously get to the top of the traversal loop where it
finds that (vp->v_mount != mp) is not true despite the fact that
the vnode has indeed been removed from our mountpoint.

Fix:

Introduce the macro MNT_VNODE_FOREACH() to traverse the list of
vnodes on a mountpoint while taking into account that vnodes may
be removed from the list as we go. This saves approx 65 lines of
duplicated code.

Split the insmntque() which potentially moves a vnode from one mount
point to another into delmntque() and insmntque() which does just
what the names say.

Fix delmntque() to set vp->v_mount to NULL while holding the
mountpoint lock.


# 130585 16-Jun-2004 phk

Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.


# 128019 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and email from Peter Wemm,
Alan Cox and Robert Watson.

Approved by: core, peter, alc, rwatson


# 126852 11-Mar-2004 phk

Remove unused mnt_reservedvnlist field.


# 125991 19-Feb-2004 tjr

Enforce the file size limit in VOP_WRITE() as well as VOP_TRUNCATE();
pointed out by bde.


# 125842 15-Feb-2004 bde

Removed support for the unsupported option READONLY. It just forced
dishonoring of requests for read-write mounts.


# 125786 13-Feb-2004 bde

MFffs (ffs_vfsops.c 1.76 (part of the big soft updates commit): lock
the vnode around calls to vinvalbuf()). Apparently no one has tested
ext2fs with DEBUG_VOP_LOCKS. Vnode locking for vinvalbuf() might not
be required in non-soft-updates cases, but it is now asserted.

MFffs (uncommitted related and nearby cleanups: don't unlock the vnode
after vinvalbuf() only to have to relock it almost immediately; don't
refer to devices classified by vn_isdisk() as block devices).


# 125739 12-Feb-2004 bde

MFffs (ffs_vfsops.c 1.227: clean up open mode bandaid). This reduces
gratuitous differences with ffs a little.


# 124908 24-Jan-2004 tjr

Copy workaround from FFS: open device for write access even if
the user requests a read-only mount. This is necessary because we
don't do the VOP_OPEN again if they upgrade a read-only mount to
read-write.

Noticed by: bde


# 122114 05-Nov-2003 bde

Fixed a reference to a nonexistent variable in previous commit. Renaming
of ffs_reload()'s mountp parameter to mp in rev.1.28 of ffs_vnops.c
had not been merged here.

ext2fs_reload() is still missing locking from not merging other changes
to ffs_reload(), but none of these is related to recent locking changes.


# 122091 05-Nov-2003 kan

Remove mntvnode_mtx and replace it with per-mountpoint mutex.
Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to
operate on this mutex transparently.

Eventually new mutex will be protecting more fields in
struct mount, not only vnode list.

Discussed with: jeff


# 121925 03-Nov-2003 kan

Use VOP_UNLOCK/vrele instead of vput. td was erecived as a parameter
and one cannot be sure it is equal to curthread.


# 121874 02-Nov-2003 kan

Take care not to call vput if thread used in corresponding vget
wasn't curthread, i.e. when we receive a thread pointer to use
as a function argument. Use VOP_UNLOCK/vrele in these cases.

The only case there td != curthread known at the moment is
boot() calling sync with thread0 pointer.

This fixes the panic on shutdown people have reported.


# 121847 01-Nov-2003 kan

Temporarily undo parts of the stuct mount locking commit by jeff.
It is unsafe to hold a mutex across vput/vrele calls.

This will be redone when a better locking strategy is agreed upon.

Discussed with: jeff


# 120783 05-Oct-2003 jeff

- File systems that wish to inspect the vnode contents or their private
v_data field before calling vget/vn_lock must check VI_XLOCK manually to
be sure that v_data is still valid. Implement this check in two places
here.


# 120751 04-Oct-2003 jeff

- Don't use vrecycle() call vgonel() directly after grabing the vnode
interlock. We do this so that we still hold the interlock when we lock
the vnode later. This prevents races with the mnt vnode list.


# 119513 27-Aug-2003 jeff

- Clean-up comments that refer to the use of B_LOCKED.


# 118047 26-Jul-2003 phk

Add a "int fd" argument to VOP_OPEN() which in the future will
contain the filedescriptor number on opens from userland.

The index is used rather than a "struct file *" since it conveys a bit
more information, which may be useful to in particular fdescfs and /dev/fd/*

For now pass -1 all over the place.


# 116271 12-Jun-2003 phk

Initialize struct vfsops C99-sparsely.

Submitted by: hmp
Reviewed by: phk


# 111856 03-Mar-2003 jeff

- Add a new 'flags' parameter to getblk().
- Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT
flag to the initial BUF_LOCK(). This will eventually be used in cases
were we want to use a buffer only if it is not currently in use.
- Convert all consumers of the getblk() api to use this extra parameter.

Reviwed by: arch
Not objected to by: mckusick


# 111119 19-Feb-2003 imp

Back out M_* changes, per decision of the TRB.

Approved by: trb


# 109623 21-Jan-2003 alfred

Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.


# 108533 01-Jan-2003 schweikh

Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup,
especially in troff files.


# 105077 14-Oct-2002 mckusick

Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by: DARPA & NAI Labs.


# 103938 25-Sep-2002 jeff

- Lock access to the buf lists.
- Use vrefcnt() where appropriate.


# 103314 14-Sep-2002 njl

Remove all use of vnode->v_tag, replacing with appropriate substitutes.
v_tag is now const char * and should only be used for debugging.

Additionally:
1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK
2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which
is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP.

Suggested by: phk
Reviewed by: bde, rwatson (earlier version)


# 97255 24-May-2002 mux

Convert ext2fs to nmount(2).


# 96881 18-May-2002 iedowse

Add an ext2_uninit() routine that undoes the actions performed by
ext2_init(). This permits the ext2fs module to be unloaded without
causing panics and leaking memory.


# 96880 18-May-2002 iedowse

Fix two off-by-one errors when sanity-checking inode numbers. In
ext2fs, inode numbers start at 1, so the maximum valid inode number
is (s_inodes_per_group * s_groups_count), not one less. This is
just a minimal change to avoid unnecessary panics and errors; some
other related bugs that Bruce Evans mentioned to me are not addressed.

Reviewed by: bde (ages ago)


# 96752 16-May-2002 iedowse

Remove register keyword.


# 96749 16-May-2002 iedowse

Complete the separation of ext2fs from ufs by copying the remaining
shared code and converting all ufs references. Originally it may
have made sense to share common features between the two filesystems,
but recently it has only caused problems, the UFS2 work being the
final straw.

All UFS_* indirect calls are now direct calls to ext2_* functions,
and ext2fs-specific mount and inode structures have been introduced.


# 93593 01-Apr-2002 jhb

Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on: smp@


# 93430 30-Mar-2002 bde

In ffs_mountffs(), set mnt_iosize_max to si_iosize_max unconditionally
provided the latter is nonzero. At this point, the former is a fairly
arbitrary default value (DFTPHYS), so changing it to any reasonable
value specified by the device driver is safe. Using the maximum of
these limits broke ffs clustered i/o for devices whose si_iosize_max
is < DFLTPHYS. Using the minimum would break device drivers' ability
to increase the active limit from DFTLPHYS up to MAXPHYS.

Copied the code for this and the associated (unnecessary?) fixup of
mp_iosize_max to all other filesystems that use clustering (ext2fs and
msdosfs). It was completely missing.

PR: 36309
MFC-after: 1 week


# 93016 23-Mar-2002 bde

Moved $FreeBSD$ to the correct place.


# 93014 23-Mar-2002 bde

Fixed some style bugs in the removal of __P(()). Continuation lines
were not outdented to preserve non-KNF lining up of code with parentheses.
Switch to KNF formatting.


# 92728 19-Mar-2002 alfred

Remove __P.


# 92462 16-Mar-2002 mckusick

Add a flags parameter to VFS_VGET to pass through the desired
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems.
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.


# 92082 11-Mar-2002 phk

Remove use of the bogus ioctl DIOCGPART.

It was used to initialize an unused variable, because ext2fs was
copy&pasted from UFS rather than copy,paste&cleaned from UFS.

Suggested by: bde


# 91406 27-Feb-2002 jhb

Simple p_ucred -> td_ucred changes to start using the per-thread ucred
reference.


# 86037 04-Nov-2001 dillon

Add mnt_reservedvnlist so we can MFC to 4.x, in order to make all mount
structure changes now rather then piecemeal later on. mnt_nvnodelist
currently holds all the vnodes under the mount point. This will eventually
be split into a 'dirty' and 'clean' list. This way we only break kld's once
rather then twice. nvnodelist will eventually turn into the dirty list
and should remain compatible with the klds.


# 85339 22-Oct-2001 dillon

Change the vnode list under the mount point from a LIST to a TAILQ
in preparation for an implementation of limiting code for kern.maxvnodes.

MFC after: 3 days


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 78940 28-Jun-2001 jhb

Fix more mntvnode and vnode interlock order reversals.


# 78907 28-Jun-2001 jhb

Fix a mntvnode and vnode interlock reversal.


# 76688 16-May-2001 iedowse

Change the second argument of vflush() to an integer that specifies
the number of references on the filesystem root vnode to be both
expected and released. Many filesystems hold an extra reference on
the filesystem root vnode, which must be accounted for when
determining if the filesystem is busy and then released if it isn't
busy. The old `skipvp' approach required individual filesystem
xxx_unmount functions to re-implement much of vflush()'s logic to
deal with the root vnode.

All 9 filesystems that hold an extra reference on the root vnode
got the logic wrong in the case of forced unmounts, so `umount -f'
would always fail if there were any extra root vnode references.
Fix this issue centrally in vflush(), now that we can.

This commit also fixes a vnode reference leak in devfs, which could
result in idle devfs filesystems that refuse to unmount.

Reviewed by: phk, bp


# 75934 25-Apr-2001 phk

Move the netexport structure from the fs-specific mountstructure
to struct mount.

This makes the "struct netexport *" paramter to the vfs_export
and vfs_checkexport interface unneeded.

Consequently that all non-stacking filesystems can use
vfs_stdcheckexp().

At the same time, make it a pointer to a struct netexport
in struct mount, so that we can remove the bogus AF_MAX
and #include <net/radix.h> from <sys/mount.h>


# 73286 01-Mar-2001 adrian

Reviewed by: jlemon

An initial tidyup of the mount() syscall and VFS mount code.

This code replaces the earlier work done by jlemon in an attempt to
make linux_mount() work.

* the guts of the mount work has been moved into vfs_mount().

* move `type', `path' and `flags' from being userland variables into being
kernel variables in vfs_mount(). `data' remains a pointer into
userspace.

* Attempt to verify the `type' and `path' strings passed to vfs_mount()
aren't too long.

* rework mount() and linux_mount() to take the userland parameters
(besides data, as mentioned) and pass kernel variables to vfs_mount().
(linux_mount() already did this, I've just tidied it up a little more.)

* remove the copyin*() stuff for `path'. `data' still requires copyin*()
since its a pointer into userland.

* set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each
filesystem. This variable is generally initialised with `path', and
each filesystem can override it if they want to.

* NOTE: f_mntonname is intiailised with "/" in the case of a root mount.


# 72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 71999 04-Feb-2001 phk

Mechanical change to use <sys/queue.h> macro API instead of
fondling implementation details.

Created with: sed(1)
Reviewed by: md5(1)


# 71699 26-Jan-2001 jhb

Back out proc locking to protect p_ucred for obtaining additional
references along with the actual obtaining of additional references.


# 71576 24-Jan-2001 jasone

Convert all simplelocks to mutexes and remove the simplelock implementations.


# 71483 23-Jan-2001 jhb

Proc locking, mostly protecting p_ucred while obtaining additional
references.


# 69806 09-Dec-2000 mjacob

Use a pointer to a size_t for the 4th argument to copyinstr-
not a pointer to a u_int.


# 68291 03-Nov-2000 bde

Support filesystems with the not-so-new "sparse_superblocks" feature.
When this feature is enabled, mke2fs doesn't necessarily allocate a
super block and its associated descriptor blocks for every group.
The (non-)allocations are reflected in the block bitmap. Since the
filesystem code doesn't write to these blocks except for the first
superblock, all it has to do to support them is to not count them in
ext2_statfs() and not attempt to check them at mount time in
ext2_check_blocks_bitmap() (the check has never been enabled in
FreeBSD anyway).


# 66886 09-Oct-2000 eivind

Blow away the v_specmountpoint define, replacing it with what it was
defined as (rdev->si_mountpoint)


# 66615 03-Oct-2000 jasone

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 66378 25-Sep-2000 bp

ext2fs depends on ufs code, so update it to properly handle v_lock field.

Noticed by: bde


# 60041 05-May-2000 phk

Separate the struct bio related stuff out of <sys/buf.h> into
<sys/bio.h>.

<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall
not be made a nested include according to bdes teachings on the
subject of nested includes.

Diskdrivers and similar stuff below specfs::strategy() should no
longer need to include <sys/buf.> unless they need caching of data.

Still a few bogus uses of struct buf to track down.

Repocopy by: peter


# 59794 30-Apr-2000 phk

Remove unneeded #include <vm/vm_zone.h>

Generated by: src/tools/tools/kerninclude


# 59259 15-Apr-2000 rwatson

ext2fs relies on UFS support code, and as a result also requires
extattr.h to be included. This fixes the broken ext2fs build as of
the import of extattr code.

Also added $FreeBSD: $ to a couple of files that didn't have them,
without which I couldn't commit this fix.

Reported by: "George W. Dinolt" <gdinolt@pacbell.net>


# 57839 09-Mar-2000 bde

Don't forget to check for unsupported features when updating. It was
possible to defeat the check for rw incompatibilty by mounting ro and
updating to rw.

Approved by: jkh


# 55756 10-Jan-2000 phk

Give vn_isdisk() a second argument where it can return a suitable errno.

Suggested by: bde


# 55313 02-Jan-2000 bde

Don't allow mounting (or mounting R/W) of filesystems with unsupported
features (except for file types in directory entries, which will be
supported soon).

Centralized the magic number and compatibility checking.

Dropped support for ancient (pre-0.2b) filesystems, as in the Linux
version. Our "support" consisted of printing more details in the error
message before failing at mount time.


# 54803 19-Dec-1999 rwatson

Second pass commit to introduce new ACL and Extended Attribute system
calls, vnops, vfsops, both in /kern, and to individual file systems that
require a vfsop_ array entry.

Reviewed by: eivind


# 54655 15-Dec-1999 eivind

Introduce NDFREE (and remove VOP_ABORTOP)


# 53452 20-Nov-1999 phk

struct mountlist and struct mount.mnt_list have no business being
a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively.

This removes ugly mp != (void*)&mountlist comparisons.

Requested by: phk
Submitted by: Jake Burkholder jake@checker.org
PR: 14967


# 53059 09-Nov-1999 phk

Next step in the device cleanup process.

Correctly lock vnodes when calling VOP_OPEN() from filesystem mount code.

Unify spec_open() for bdev and cdev cases.

Remove the disabled bdev specific read/write code.


# 52782 01-Nov-1999 msmith

Newline-terminate the complaint message about not being able to find
the root vnode pointer.


# 51808 30-Sep-1999 phk

Remove the D_NOCLUSTER[RW] options which were added because vn had
problems. Now that Matt has fixed vn, this can go. The vn driver
should have used d_maxio (now si_iosize_max) anyway.


# 51138 10-Sep-1999 alfred

Seperate the export check in VFS_FHTOVP, exports are now checked via
VFS_CHECKEXP.

Add fh(open|stat|stafs) syscalls to allow userland to query filesystems
based on (network) filehandle.

Obtained from: NetBSD


# 50347 25-Aug-1999 phk

Introduce vn_isdisk(struct vnode *vp) function, and use it to test for diskness.


# 50256 23-Aug-1999 bde

Initialise fsids with (user) device numbers again. Bitrot when dev_t's
were changed to pointers was obscured by casting dev_t's to longs.
fsids haven't even been comprised of longs since the Lite2 merge.


# 49679 13-Aug-1999 phk

The bdevsw() and cdevsw() are now identical, so kill the former.


# 49535 08-Aug-1999 phk

Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>,
a few lines into <sys/vnode.h>.

Add a few fields to struct specinfo, paving the way for the fun part.


# 47640 31-May-1999 phk

Simplify cdevsw registration.

The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it. cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.

cdevsw_add() will print an message if the d_maj field looks bogus.

Remove nblkdev and nchrdev variables. Most places they were used
bogusly. Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.

Move bdevsw() and devsw() functions to kern/kern_conf.c

Bump __FreeBSD_version to 400006

This commit removes:
72 bogus makedev() calls
26 bogus SYSINIT functions

if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.

I4b and vinum not changed. Patches emailed to authors. LINT
probably broken until they catch up.


# 46676 08-May-1999 phk

I got tired of seeing all the cdevsw[major(foo)] all over the place.

Made a new (inline) function devsw(dev_t dev) and substituted it.

Changed to the BDEV variant to this format as well: bdevsw(dev_t dev)

DEVFS will eventually benefit from this change too.


# 46635 07-May-1999 phk

Continue where Julian left off in July 1998:

Virtualize bdevsw[] from cdevsw. bdevsw() is now an (inline)
function.

Join CDEV_MODULE and BDEV_MODULE to DEV_MODULE (please pay attention
to the order of the cmaj/bmaj arguments!)

Join CDEV_DRIVER_MODULE and BDEV_DRIVER_MODULE to DEV_DRIVER_MODULE
(ditto!)

(Next step will be to convert all bdev dev_t's to cdev dev_t's
before they get to do any damage^H^H^H^H^H^Hwork in the kernel.)


# 43395 29-Jan-1999 bde

Fixed parenthesization botch in previous commit. Async update of inodes
was broken.


# 43309 27-Jan-1999 dillon

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile.

This commit includes significant work to proper handle const arguments
for the DDB symbol routines.


# 43301 27-Jan-1999 dillon

Fix warnings in preparation for adding -Wall -Wcast-qual to the
kernel compile


# 41591 07-Dec-1998 archie

The "easy" fixes for compiling the kernel -Wunused: remove unreferenced static
and local variables, goto labels, and functions declared but not defined.


# 40790 31-Oct-1998 peter

Use TAILQ macros for clean/dirty block list processing. Set b_xflags
rather than abusing the list next pointer with a magic number.


# 40651 25-Oct-1998 bde

Don't follow null bdevsw pointers. The `major(dev) < nblkdev' test rotted
when bdevsw[] became sparse. We still depend on magic to avoid having to
check that (v_rdev) device numbers in vnodes are not NODEV.


# 39678 26-Sep-1998 bde

Updated ext2_reload() and ext2_sync(). Locking was broken, and MNT_LAZY
syncs weren't optimized properly (they probably still aren't, but are bug
for bug compatible with ffs). These fixes are mostly academic, since
ext2fs is too broken to mount read-write (it apparently doesn't clear
indirect blocks).

Obtained from: mostly from Lite2


# 39671 26-Sep-1998 bde

Fixed missing newlines in messages in ext2_check_descriptors().

Fixed vnode and memory leaks after an unlikely (?) error in
ext2_mountfs().

Fixed an unconditional memory leak in ext2_unmount().


# 39670 26-Sep-1998 bde

Fixed clean flag handling:

Fixes for bugs not shared with ffs:
- don't mount unclean filesystems rw unless forced to.
- accept EXT2_ERROR_FS (treat it like !EXT2_VALID_FS). We still don't set
this or honour the maximal mount count.
- don't attempt to print the name of the mount point when mounting an
unclean file system, since the name of the previous mount point is
unknown and the name of the current mount point is still "".

Fixes for bugs shared with ffs until recently:
- don't set the clean flag on unmount of an initially-unclean filesystem
that was (forcibly) mounted rw.
- set the clean flag on rw -> ro update of a mounted initially-clean
filesystem.
- fixed some style bugs (mostly long lines).

The fixes are slightly simpler than for ffs, because the relevant on-disk
state is not a simple boolean variable, and the superblock has a core-only
extension.

Obtained from: parts from ffs_vfsops.c, parts from NetBSD


# 39028 09-Sep-1998 bde

Fixed the usual missing permissions checks in mount(). As for cd9660,
the damage was limited by the default of 0 for vfs.usermount.

Obtained from: Lite2 via the -current ffs_vfsops.c


# 38998 09-Sep-1998 bde

Don't forget to initialize the inode lock. This bug caused
surprisingly few problems. Most fields were initialized to the
correct values by bzero(), but lk_prio was 0 instead of PINOD (=8),
the lk_wmsg was NULL instead of "ext2in", and lk_lockholder was 0
instead of -1.

Obtained from: Lite2 via the -current ffs_vfsops.c


# 38909 07-Sep-1998 bde

Removed statically configured mount type numbers (MOUNT_*) and all
references to them.

The change a couple of days ago to ignore these numbers in statically
configured vfsconf structs was slightly premature because the cd9660,
cfs, devfs, ext2fs, nfs vfs's still used MOUNT_* instead of the number
in their vfsconf struct.


# 37966 30-Jul-1998 julian

add anti-panic workaround from chris radek (cradek@in221.inetnebr.com)
Not sure why this is needed but but does stop crashes.


# 36102 16-May-1998 bde

Don't use "ffs" in an ext2fs sleep message string.

Don't forget to clear the inode hash lock before returning from ext2_vget()
after getnewvnode() fails. Obtained from: rev.1.24 of ffs_vfsops.c (the
original patch for the getnewvnode() race). Forgotten in: rev.1.4 here.

Removed a duplicate comment. Duplicated in: rev.1.4 here.

Fixed the MALLOC() vs getnewvnode() race in ext2_vget(). Obtained from:
rev.1.39 of ffs_vfsops.c.


# 35769 06-May-1998 msmith

As described by the submitter:

Reverse the VFS_VRELE patch. Reference counting of vnodes does not need
to be done per-fs. I noticed this while fixing vfs layering violations.
Doing reference counting in generic code is also the preference cited by
John Heidemann in recent discussions with him.

The implementation of alternative vnode management per-fs is still a valid
requirement for some filesystems but will be revisited sometime later,
most likely using a different framework.

Submitted by: Michael Hancock <michaelh@cet.co.jp>


# 34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


# 34430 09-Mar-1998 eivind

Make this compile after soft updates integration.

LINTing forgotten by: julian


# 33964 01-Mar-1998 msmith

The intent is to get rid of WILLRELE in vnode_if.src by making
a complement to all ops that return a vpp, VFS_VRELE. This is
initially only for file systems that implement the following ops
that do a WILLRELE:

vop_create, vop_whiteout, vop_mknod, vop_remove, vop_link,
vop_rename, vop_mkdir, vop_rmdir, vop_symlink

This is initial DNA that doesn't do anything yet. VFS_VRELE is
implemented but not called.

A default vfs_vrele was created for fs implementations that use the
standard vnode management routines.

VFS_VRELE implementations were made for the following file systems:

Standard (vfs_vrele)
ffs mfs nfs msdosfs devfs ext2fs

Custom
union umapfs

Just EOPNOTSUPP
fdesc procfs kernfs portal cd9660

These implementations may change as VOP changes are implemented.

In the next phase, in the vop implementations calls to vrele and the vrele
part of vput will be moved to the top layer vfs_vnops and made visible
to all layers. vput will be replaced by unlock in these cases. Unlocking
will still be done in the per fs layer but the refcount decrement will be
triggered at the top because it doesn't hurt to hold a vnode reference a
little longer. This will have minimal impact on the structure of the
existing code.

This will only be done for vnode arguments that are released by the various
fs vop implementations.

Wider use of VFS_VRELE will likely require restructuring of the code.

Reviewed by: phk, dyson, terry et. al.
Submitted by: Michael Hancock <michaelh@cet.co.jp>


# 31485 02-Dec-1997 bde

Use the same algorithm as ffs for generation numbers.


# 31483 02-Dec-1997 bde

Removed __FreeBSD__ ifdefs.


# 31315 20-Nov-1997 bde

Use consistent description strings for M_EXT2NODE. This also fixes a
spelling error in the unused string.


# 31132 12-Nov-1997 julian

Reviewed by: various.

Ever since I first say the way the mount flags were used I've hated the
fact that modes, and events, internal and exported, and short-term
and long term flags are all thrown together. Finally it's annoyed me enough..
This patch to the entire FreeBSD tree adds a second mount flag word
to the mount struct. it is not exported to userspace. I have moved
some of the non exported flags over to this word. this means that we now
have 8 free bits in the mount flags. There are another two that might
well move over, but which I'm not sure about.
The only user visible change would have been in pstat -v, except
that davidg has disabled it anyhow.
I'd still like to move the state flags and the 'command' flags
apart from each other.. e.g. MNT_FORCE really doesn't have the
same semantics as MNT_RDONLY, but that's left for another day.


# 30492 16-Oct-1997 phk

Another VFS cleanup "kilo commit"

1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
intereface function, and now lives in the ufsmount structure.

2. Remove VOP_SEEK, it was unused.

3. Add mode default vops:

VOP_ADVLOCK vop_einval
VOP_CLOSE vop_null
VOP_FSYNC vop_null
VOP_IOCTL vop_enotty
VOP_MMAP vop_einval
VOP_OPEN vop_null
VOP_PATHCONF vop_einval
VOP_READLINK vop_einval
VOP_REALLOCBLKS vop_eopnotsupp

And remove identical functionality from filesystems

4. Add vop_stdpathconf, which returns the canonical stuff. Use
it in the filesystems. (XXX: It's probably wrong that specfs
and fifofs sets this vop, shouldn't it come from the "host"
filesystem, for instance ufs or cd9660 ?)

5. Try to make system wide VOP functions have vop_* names.

6. Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)


# 30474 16-Oct-1997 phk

VFS mega cleanup commit (x/N)

1. Add new file "sys/kern/vfs_default.c" where default actions for
VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
POLL, REVOKE and STRATEGY. Various stuff spread over the entire
tree belongs here.

2. Change VOP_BLKATOFF to a normal function in cd9660.

3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These
are private interface functions between UFS and the underlying
storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now
live in struct ufsmount instead.

4. Remove a kludge of VOP_ functions in all filesystems, that did
nothing but obscure the simplicity and break the expandability.
If a filesystem doesn't implement VOP_FOO, it shouldn't have an
entry for it in its vnops table. The system will try to DTRT
if it is not implemented. There are still some cruft left, but
the bulk of it is done.

5. Fix another VCALL in vfs_cache.c (thanks Bruce!)


# 30469 16-Oct-1997 julian

Two more places where root filesystems were mounted, put them at the head of
the mount list in case there is already DEVFS present.


# 30354 12-Oct-1997 phk

Last major round (Unless Bruce thinks of somthing :-) of malloc changes.

Distribute all but the most fundamental malloc types. This time I also
remembered the trick to making things static: Put "static" in front of
them.

A couple of finer points by: bde


# 30309 11-Oct-1997 phk

Distribute and statizice a lot of the malloc M_* types.

Substantial input from: bde


# 30280 10-Oct-1997 phk

Mega commit to cleanup the "remaining nits" after my malloc change.

Introduce a M_EXT2NODE for ext2fs vnodes.
Use generic ufs_reclaim instead of hijacking ffs_reclaim.


# 29906 28-Sep-1997 kato

Oops, include <sys/conf.h>.

Reminded-by: Simon Shapiro <Shimon@i-Connect.Net>


# 29888 27-Sep-1997 kato

Clustered read and write are switched at mount-option level.

1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW
bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread
and vfs.foo.doclusterwrite are deleted. Only mount option can
control clustered I/O from userland.
2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR
and D_CLUSTERW bits of the d_flags member in the block device switch
table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR /
MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and
MNT_NOCLUSTERW cannot be cleared from userland.
3. Vnode driver disables both clustered read and write.
4. Union filesystem disables clutered write.

Reviewed by: bde


# 29208 07-Sep-1997 bde

Removed yet more vestiges of config-time swap configuration and/or
cleaned up nearby cruft.


# 28270 16-Aug-1997 wollman

Fix all areas of the system (or at least all those in LINT) to avoid storing
socket addresses in mbufs. (Socket buffers are the one exception.) A number
of kernel APIs needed to get fixed in order to make this happen. Also,
fix three protocol families which kept PCBs in mbufs to not malloc them
instead. Delete some old compatibility cruft while we're at it, and add
some new routines in the in_cksum family.


# 27881 04-Aug-1997 dyson

Fix a problem with ext2fs so that filesystems mounted at reboot don't
keep ahold of buffers, and therefore leave filesystems dirty. I haven't
been able to test, but the code compiles. Those who run -current, please
test and report back!!! (Sorry :-)).

PR: kern/3571
Submitted by: Dirk Keunecke <dk@panda.rhein-main.de>


# 26641 14-Jun-1997 bde

Removed unused #includes.


# 24203 24-Mar-1997 bde

Don't include <sys/ioctl.h> in the kernel. Stage 1: don't include
it when it is not used. In most cases, the reasons for including it
went away when the special ioctl headers became self-sufficient.


# 24131 23-Mar-1997 bde

Don't #include <sys/fcntl.h> in <sys/file.h> if KERNEL is defined.
Fixed everything that depended on getting fcntl.h stuff from the wrong
place. Most things don't depend on file.h stuff at all.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 16322 12-Jun-1996 gpalmer

Clean up -Wunused warnings.

Reviewed by: bde


# 14249 25-Feb-1996 bde

Removed vestigial support for the obsolete FIFO option. In ext2fs
it caused null pointer panics for all fifo operations unless FIFO
was defined.


# 13260 05-Jan-1996 wollman

Convert QUOTA to new-style option.


# 12911 17-Dec-1995 phk

Staticize.


# 12746 10-Dec-1995 bde

Restored variables that are used iff QUOTA is defined.

ext2fs still uses #if in many cases where the rest of the kernel uses
#ifdef (for QUOTA...).


# 12406 19-Nov-1995 dyson

Correct some serious porting errors. The worst one was that the
vnode was being placed upon the mount point twice!!!


# 12159 09-Nov-1995 bde

ext2_inode_cnv.c:
Included <sys/vnode.h> and its prerequisite <sys/proc.h>, and cleaned
up includes. The vop_t changes made the non-inclusion of vnode.h
fatal instead of just sloppy.

i386_bitops.h:
Changed `extern inline' to `static inline'. `extern inline' is a
Linuxism that stops things from compiling without -O. Fixed
idempotency identifier.

Misc:
Added prototypes. Staticized some functions so that prototypes are
unnecessary. Added casts. Cleaned up includes.


# 12147 08-Nov-1995 dyson

Cleaned up some lint and some obvious prototyping errors.


# 12115 05-Nov-1995 dyson

Main code for the ext2fs filesystem. Please refer to the COPYRIGHT.INFO
file for GPL restrictions. This code was ported to the BSD platform
by Godmar Back <gback@facility.cs.utah.edu> and specifically to FreeBSD
by John Dyson. This code is still green and should be used with caution.
Additional changes to UFS necessary to make this code work will be commited
seperately.
Submitted by: Godmar Back <gback@facility.cs.utah.edu>
Obtained from: Lites/Mach4