History log of /freebsd-9.3-release/sys/ufs/ufs/ufs_inode.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 267654 19-Jun-2014 gjb

Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 235626 18-May-2012 mckusick

MFC of 234386, 234400, 234441, 234443, 234482, 234483, 235052, 235241,
235246, and 235619

MFC: 234386

Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL.
The primary changes are that the user of the interface no longer
needs to manage the mount-mutex locking and that the vnode that
is returned has its mutex locked (thus avoiding the need to check
to see if its is DOOMED or other possible end of life senarios).

To minimize compatibility issues for third-party developers, the
old MNT_VNODE_FOREACH interface will remain available so that this
change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH
will be removed in head.

The reason for this update is to prepare for the addition of the
MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the
active vnodes associated with a mount point (typically less than
1% of the vnodes associated with the mount point).

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks

MFC: 234400

Drop export of vdestroy() function from kern/vfs_subr.c as it is
used only as a helper function in that file. Replace sole call to
vbusy() with inline code in vholdl(). Replace sole calls to vfree()
and vdestroy() with inline code in vdropl().

The Clang compiler already inlines these functions, so they do not
show up in a kernel backtrace which is confusing. Also you cannot
set their frame in kgdb which means that it is impossible to view
their local variables. So, while the produced code is unchanged,
the debugging should be easier.

Discussed with: kib
MFC after: 2 weeks

MFC: 234441

Fix a memory leak of M_VNODE_MARKER introduced in 234386.

Found by: Peter Holm

MFC: 234443

Delete a no longer useful VNASSERT missed during changes in 234400.

Suggested by: kib

MFC: 234482

This change creates a new list of active vnodes associated with
a mount point. Active vnodes are those with a non-zero use or hold
count, e.g., those vnodes that are not on the free list. Note that
this list is in addition to the list of all the vnodes associated
with a mount point.

To avoid adding another set of linkage pointers to the vnode
structure, the active list uses the existing linkage pointers
used by the free list (previously named v_freelist, now renamed
v_actfreelist).

This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point (typically
less than 1% of the vnodes associated with the mount point).

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks

MFC: 234483

This update uses the MNT_VNODE_FOREACH_ACTIVE interface that loops
over just the active vnodes associated with a mount point to replace
MNT_VNODE_FOREACH_ALL in the vfs_msync, ffs_sync_lazy, and qsync
routines.

The vfs_msync routine is run every 30 seconds for every writably
mounted filesystem. It ensures that any files mmap'ed from the
filesystem with modified pages have those pages queued to be
written back to the file from which they are mapped.

The ffs_lazy_sync and qsync routines are run every 30 seconds for
every writably mounted UFS/FFS filesystem. The ffs_lazy_sync routine
ensures that any files that have been accessed in the previous
30 seconds have had their access times queued for updating in the
filesystem. The qsync routine ensures that any files with modified
quotas have those quotas queued to be written back to their
associated quota file.

In a system configured with 250,000 vnodes, less than 1000 are
typically active at any point in time. Prior to this change all
250,000 vnodes would be locked and inspected twice every minute
by the syncer. For UFS/FFS filesystems they would be locked and
inspected six times every minute (twice by each of these three
routines since each of these routines does its own pass over the
vnodes associated with a mount point). With this change the syncer
now locks and inspects only the tiny set of vnodes that are active.

Reviewed by: kib
Tested by: Peter Holm
MFC after: 2 weeks

MFC: 235052 (by pluknet)

Fix mount mutex handling missed in r234386.

MFC: 235241 (by pluknet)

Fix mount interlock oversights from the previous change in r234386.

Reported by: dougb
Submitted by: Mateusz Guzik <mjguzik at gmail com>
Reviewed by: Kirk McKusick
Tested by: pho

MFC: 235246

Fix mount mutex handling missed in r234386.

MFC: 235619

Update comment to document that the vnode free-list mutex needs to be
held when updating mnt_activevnodelist and mnt_activevnodelistsize.


# 225736 22-Sep-2011 kensmith

Copy head to stable/9 as part of 9.0-RELEASE release cycle.

Approved by: re (implicit)


# 223769 04-Jul-2011 jeff

- Fix an inode quota leak. We need to decrement the quota once and only
once.

Tested by: pho
Reviewed by: mckusick


# 220985 24-Apr-2011 kib

VFS sometimes is unable to inactivate a vnode when vnode use count
goes to zero. E.g., the vnode might be only shared-locked at the time of
vput() call. Such vnodes are kept in the hash, so they can be found later.

If ffs_valloc() allocated an inode that has its vnode cached in hash, and
still owing the inactivation, then vget() call from ffs_valloc() clears
VI_OWEINACT, and then the vnode is reused for the newly allocated inode.

The problem is, the vnode is not reclaimed before it is put to the new
use. ffs_valloc() recycles vnode vm object, but this is not enough.
In particular, at least v_vflag should be cleared, and several bits of
UFS state need to be removed.

It is very inconvenient to call vgone() at this point. Instead, move
some parts of ufs_reclaim() into helper function ufs_prepare_reclaim(),
and call the helper from VOP_RECLAIM and ffs_valloc().

Reviewed by: mckusick
Tested by: pho
MFC after: 3 weeks


# 216792 29-Dec-2010 kib

Use a proper type for the variable holding the summary size of the inode
data. Otherwise, on 32bit systems, unlinked inode which size is the
multiple of 4GB was not truncated, causing corruption.

Reported by: brucec
Reviewed by: mckusick
Tested by: pho


# 215548 19-Nov-2010 kib

Remove prtactive variable and related printf()s in the vop_inactive
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.

MFC after: 1 week
X-MFC-note: keep prtactive symbol in vfs_subr.c


# 209717 06-Jul-2010 jeff

- Handle the truncation of an inode with an effective link count of 0 in
the context of the process that reduced the effective count. Previously
all truncation as a result of unlink happened in the softdep flush
thread. This had the effect of being impossible to rate limit properly
with the journal code. Now the process issuing unlinks is suspended
when the journal files. This has a side-effect of improving rm
performance by allowing more concurrent work.
- Handle two cases in inactive, one for effnlink == 0 and another when
nlink finally reaches 0.
- Eliminate the SPACECOUNTED related code since the truncation is no
longer delayed.

Discussed with: mckusick


# 183070 16-Sep-2008 kib

When downgrading the read-write mount to read-only, do_unmount() sets
MNT_RDONLY flag before the VFS_MOUNT() is called. In ufs_inactive()
and ufs_itimes_locked(), UFS verifies whether the fs is read-only by
checking MNT_RDONLY, but this may cause loss of the IN_MODIFIED flag
for inode on the fs being remounted rw->ro.

Introduce UFS_RDONLY() struct ufsmount' method that reports the value
of the fs_ronly. The later is set to 1 only after the remount is
finished.

Reviewed by: tegge
In collaboration with: pho
MFC after: 1 month


# 170991 22-Jun-2007 kib

Fix livelock that could occur when snapshoting UFS with quotas, where
some quota limit was exceeded. Sequence of UFS_VALLOC()/UFS_VFREE()
call there could cause inodeblock to have both freefile and inodedep
dependencies without any inode in the block being marked for write.
Then, softdep_check_suspend() would return EAGAIN forewer.

Force write of inodeblock with allocated freefile softdependency by
setting IN_MODIFIED flag in softdep_freefile and unconditionally calling
UFS_UPDATE() in ufs_reclaim.

Reported by: kris
Debug help and tested by: Peter Holm
Approved by: re (kensmith)
MFC after: 3 weeks


# 163841 31-Oct-2006 pjd

Add gjournal specific code to the UFS file system:
- Add FS_GJOURNAL flag which enables gjournal support on a file system.
- Add cg_unrefs field to the cylinder group structure which holds
number of unreferenced (orphaned) inodes in the given cylinder group.
- Add fs_unrefs field to the super block structure which holds
total number of unreferenced (orphaned) inodes.
- When file or a directory is orphaned (last reference is removed, but
object is still open), increase fs_unrefs and cg_unrefs fields,
which is a hint for fsck in which cylinder groups looks for such
(orphaned) objects.
- When file is last closed, decrease {fs,cg}_unrefs fields.
- Add VV_DELETED vnode flag which points at orphaned objects.

Sponsored by: home.pl


# 158382 09-May-2006 tegge

Bring the call to softdep_releasefile() within the region protected by
vn_start_secondary_write() since it might cause file system write activity
(e.g. ffs_snapremove()).


# 156560 10-Mar-2006 tegge

Block secondary writes while expunging active unlinked files.

Fix detection of active unlinked files by checking VI_OWEINACT and
VI_DOINGINACT in addition to v_usecount.

Defer inactive handling for unlinked files if the file system is mostly
suspended (secondary writes being blocked).

Perform deferred inactive handling after the file system is resumed.


# 156451 08-Mar-2006 tegge

Use vn_start_secondary_write() and vn_finished_secondary_write() as a
replacement for vn_write_suspend_wait() to better account for secondary write
processing.

Close race where secondary writes could be started after ffs_sync() returned
but before the file system was marked as suspended.

Detect if secondary writes or softdep processing occurred during vnode sync
loop in ffs_sync() and retry the loop if needed.


# 150492 23-Sep-2005 delphij

Restore a historical ufs_inactive behavior that has been changed
in rev. 1.40 of ufs_inode.c, which allows an inode being truncated
even when the filesystem itself is marked RDONLY. A subsequent
call of UFS_TRUNCATE (ffs_truncate) would panic the system as it
asserts that it can only be called when the filesystem is mounted
read-write (same changeset, rev. 1.74 of sys/ufs/ffs/ffs_inode.c).

Because ffs_mount() already takes care of sync'ing the filesystem
to disk before being downgraded to readonly, it appears to be more
desirable that we should not permit this sort of writes to disk.

This change would fix a panic that occours when read-only mounted
a corrupted filesystem and doing some file operations.

MT6/5/4 candidate

Reviewed by: mckusick


# 143743 17-Mar-2005 jeff

- Lock the clearing of v_data in ufs_reclaim() to prevent a pagefault
in ffs_lock() when it acesses v_data without the vnlock.

Sponsored by: Isilon Systems, Inc.


# 143666 15-Mar-2005 phk

Don't hold a reference on the disk vnode for each inode.


# 143613 14-Mar-2005 jeff

- Destroy the vnode object earlier in VOP_RECLAIM as we need more of
the vnode valid before the vm flushes pages.
- Get rid of some extraneous uses of the vnode interlock.

Sponsored by: Isilon Systems, Inc.


# 143562 14-Mar-2005 phk

Use vfs_hash instead of home-rolled.


# 143499 13-Mar-2005 jeff

- Don't drop the lock in ufs_inactive().
- Also in ufs_inactive, don't acquire the vnode interlock where it isn't
strictly needed. Also owning the vnode interlock while calling vprint()
will cause locking assertions to trip.

Sponsored by: Isilon Systems, Inc.


# 142079 19-Feb-2005 phk

Try to unbreak the vnode locking around vop_reclaim() (based mostly on
patch from kan@).

Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on
close. This is not yet a generally safe function, but for this very
specific use it is safe. This solves the problem with buffers not
being flushed by unmount or after failed mount attempts.


# 140939 28-Jan-2005 phk

Make filesystems get rid of their own vnodes vnode_pager object in
VOP_RECLAIM().


# 140936 28-Jan-2005 phk

Remove unused argument to vrecycle()


# 139825 07-Jan-2005 imp

/* -> /*- for license, minor formatting changes


# 132775 28-Jul-2004 kan

Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper
inode field based on UFS version. Use DIP ro read values and DIP_SET
to modify them throughout FFS code base.


# 127975 07-Apr-2004 imp

Remove advertising clause from University of California Regent's
license, per letter dated July 22, 1999 and irc message from Robert
Watson saying that clause 3 can be removed from those files with an
NAI copyright that also have only a University of California
copyrights.

Approved by: core, rwatson


# 120777 05-Oct-2003 jeff

- Don't cache_purge() in ufs_reclaim. vclean() does it for us so
this is redundant.


# 118969 15-Aug-2003 phk

Eliminate the i_devvp field from the incore UFS inodes, we can
get the same value from ip->i_ump->um_devvp.

This saves a pointer in the memory copies of inodes, which can
easily run into several hundred kilobytes.

The extra indirection is unmeasurable in benchmarks.

Approved by: mckusick


# 116192 11-Jun-2003 obrien

Use __FBSDID().


# 108313 27-Dec-2002 phk

Make ffs_mountfs() static.

Remove the malloctype from the ufs mount structure, instead add a callback
to the storage method for freeing inodes: UFS_IFREE().

Add vfs_ifree() method function which frees an inode.

Unvariablelize the malloc type used for allocating inodes.


# 105077 14-Oct-2002 mckusick

Regularize the vop_stdlock'ing protocol across all the filesystems
that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).

In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.

Sponsored by: DARPA & NAI Labs.


# 103944 25-Sep-2002 jeff

- Lock accesses to v_usecount.
- Convert interlock locks to use standard macros.


# 102774 01-Sep-2002 rwatson

Since we have vp and td cached in local variables, use those instead
of derefencing the VOP arguments again when calling the UFS code.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs


# 100344 19-Jul-2002 mckusick

Add support to UFS2 to provide storage for extended attributes.
As this code is not actually used by any of the existing
interfaces, it seems unlikely to break anything (famous
last words).

The internal kernel interface to manipulate these attributes
is invoked using two new IO_ flags: IO_NORMAL and IO_EXT.
These flags may be specified in the ioflags word of VOP_READ,
VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that
you want to do I/O to the normal data part of the file and
IO_EXT means that you want to do I/O to the extended attributes
part of the file. IO_NORMAL and IO_EXT are mutually exclusive
for VOP_READ and VOP_WRITE, but may be specified individually
or together in the case of VOP_TRUNCATE. For example, when
removing a file, VOP_TRUNCATE is called with both IO_NORMAL
and IO_EXT set. For backward compatibility, if neither IO_NORMAL
nor IO_EXT is set, then IO_NORMAL is assumed.

Note that the BA_ and IO_ flags have been `merged' so that they
may both be used in the same flags word. This merger is possible
by assigning the IO_ flags to the low sixteen bits and the BA_
flags the high sixteen bits. This works because the high sixteen
bits of the IO_ word is reserved for read-ahead and help with
write clustering so will never be used for flags. This merge
lets us get away from code of the form:

if (ioflags & IO_SYNC)
flags |= BA_SYNC;

For the future, I have considered adding a new field to the
vattr structure, va_extsize. This addition could then be
exported through the stat structure to allow applications to
find out the size of the extended attribute storage and also
would provide a more standard interface for truncating them
(via VOP_SETATTR rather than VOP_TRUNCATE).

I am also contemplating adding a pathconf parameter (for
concreteness, lets call it _PC_MAX_EXTSIZE) which would
let an application determine the maximum size of the extended
atribute storage.

Sponsored by: DARPA & NAI Labs.


# 98788 24-Jun-2002 mckusick

Force the quota update to be done when an inode is released in
ufs_inactive. This avoid a panic when checking a NULL credential
in suser_cred().


# 98542 21-Jun-2002 mckusick

This commit adds basic support for the UFS2 filesystem. The UFS2
filesystem expands the inode to 256 bytes to make space for 64-bit
block pointers. It also adds a file-creation time field, an ability
to use jumbo blocks per inode to allow extent like pointer density,
and space for extended attributes (up to twice the filesystem block
size worth of attributes, e.g., on a 16K filesystem, there is space
for 32K of attributes). UFS2 fully supports and runs existing UFS1
filesystems. New filesystems built using newfs can be built in either
UFS1 or UFS2 format using the -O option. In this commit UFS1 is
the default format, so if you want to build UFS2 format filesystems,
you must specify -O 2. This default will be changed to UFS2 when
UFS2 proves itself to be stable. In this commit the boot code for
reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c)
as there is insufficient space in the boot block. Once the size of the
boot block is increased, this code can be defined.

Things to note: the definition of SBSIZE has changed to SBLOCKSIZE.
The header file <ufs/ufs/dinode.h> must be included before
<ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and
ufs_lbn_t.

Still TODO:
Verify that the first level bootstraps work for all the architectures.
Convert the utility ffsinfo to understand UFS2 and test growfs.
Add support for the extended attribute storage. Update soft updates
to ensure integrity of extended attribute storage. Switch the
current extended attribute interfaces to use the extended attribute
storage. Add the extent like functionality (framework is there,
but is currently never used).

Sponsored by: DARPA & NAI Labs.
Reviewed by: Poul-Henning Kamp <phk@freebsd.org>


# 96506 13-May-2002 phk

Remove register keyword.

Sponsored by: DARPA & NAI Labs.
Submitted by: mckusick


# 96010 04-May-2002 jeff

Include systm.h so panic(9) is defined when doing DEBUG_ALL_VFS_LOCKS.


# 89384 15-Jan-2002 mckusick

When downgrading a filesystem from read-write to read-only, operations
involving file removal or file update were not always being fully
committed to disk. The result was lost files or corrupted file data.
This change ensures that the filesystem is properly synced to disk
before the filesystem is down-graded.

This delta also fixes a long standing bug in which a file open for
reading has been unlinked. When the last open reference to the file
is closed, the inode is reclaimed by the filesystem. Previously,
if the filesystem had been down-graded to read-only, the inode could
not be reclaimed, and thus was lost and had to be later recovered
by fsck. With this change, such files are found at the time of the
down-grade. Normally they will result in the filesystem down-grade
failing with `device busy'. If a forcible down-grade is done, then
the affected files will be revoked causing the inode to be released
and the open file descriptors to begin failing on attempts to read.

Submitted by: "Sam Leffler" <sam@errno.com>


# 84811 11-Oct-2001 jhb

Add missing includes of sys/lock.h.


# 83366 12-Sep-2001 julian

KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after: ha ha ha ha


# 79561 10-Jul-2001 iedowse

Bring in dirhash, a simple hash-based lookup optimisation for large
directories. When enabled via "options UFS_DIRHASH", in-core hash
arrays are maintained for large directories. These allow all
directory operations to take place quickly instead of requiring
long linear searches. For now anyway, dirhash is not enabled by
default.

The in-core hash arrays have a memory requirement that is approximately
half the size of the size of the on-disk directory file. A number
of new sysctl variables allow control over which directories get
hashed and over the maximum amount of memory that dirhash will use:

vfs.ufs.dirhash_minsize
The minimum on-disk directory size for which hashing should be
used. The default is 2560 (2.5k).

vfs.ufs.dirhash_maxmem
The system-wide maximum total memory to be used by dirhash data
structures. The default is 2097152 (2MB).

The current amount of memory being used by dirhash is visible
through the read-only sysctl variable vfs.ufs.dirhash_maxmem.
Finally, some extra sanity checks that are enabled by default, but
which may have an impact on performance, can be disabled by setting
vfs.ufs.dirhash_docheck to 0.

Discussed on: -fs, -hackers


# 76357 08-May-2001 mckusick

When running with soft updates, track the number of blocks and files
that are committed to being freed and reflect these blocks in the
counts returned by statfs (and thus also by the `df' command). This
change allows programs such as those that do news expiration to
know when to stop if they are trying to create a certain percentage
of free space. Note that this change does not solve the much harder
problem of making this to-be-freed space available to applications
that want it (thus on a nearly full filesystem, you may still
encounter out-of-space conditions even though the free space will
show up eventually). Hopefully this harder problem will be the
subject of a future enhancement.


# 76117 29-Apr-2001 grog

Revert consequences of changes to mount.h, part 2.

Requested by: bde


# 75858 23-Apr-2001 grog

Correct #includes to work with fixed sys/mount.h.


# 74548 21-Mar-2001 mckusick

Add kernel support for running fsck on active filesystems.


# 74433 19-Mar-2001 rwatson

o Change options FFS_EXTATTR and options FFS_EXTATTR_AUTOSTART to
options UFS_EXTATTR and UFS_EXTATTR_AUTOSTART respectively. This change
reflects the fact that our EA support is implemented entirely at the
UFS layer (modulo FFS start/stop/autostart hooks for mount and unmount
events). This also better reflects the fact that [shortly] MFS will also
support EAs, as well as possibly IFS.

o Consumers of the EA support in FFS are reminded that as a result, they
must change kernel config files to reflect the new option names.

Obtained from: TrustedBSD Project


# 71576 24-Jan-2001 jasone

Convert all simplelocks to mutexes and remove the simplelock implementations.


# 66615 03-Oct-2000 jasone

Convert lockmgr locks from using simple locks to using mutexes.

Add lockdestroy() and appropriate invocations, which corresponds to
lockinit() and must be called to clean up after a lockmgr lock is no
longer needed.


# 63788 24-Jul-2000 mckusick

This patch corrects the first round of panics and hangs reported
with the new snapshot code.

Update addaliasu to correctly implement the semantics of the old
checkalias function. When a device vnode first comes into existence,
check to see if an anonymous vnode for the same device was created
at boot time by bdevvp(). If so, adopt the bdevvp vnode rather than
creating a new vnode for the device. This corrects a problem which
caused the kernel to panic when taking a snapshot of the root
filesystem.

Change the calling convention of vn_write_suspend_wait() to be the
same as vn_start_write().

Split out softdep_flushworklist() from softdep_flushfiles() so that
it can be used to clear the work queue when suspending filesystem
operations.

Access to buffers becomes recursive so that snapshots can recursively
traverse their indirect blocks using ffs_copyonwrite() when checking
for the need for copy on write when flushing one of their own indirect
blocks. This eliminates a deadlock between the syncer daemon and a
process taking a snapshot.

Ensure that softdep_process_worklist() can never block because of a
snapshot being taken. This eliminates a problem with buffer starvation.

Cleanup change in ffs_sync() which did not synchronously wait when
MNT_WAIT was specified. The result was an unclean filesystem panic
when doing forcible unmount with heavy filesystem I/O in progress.

Return a zero'ed block when reading a block that was not in use at
the time that a snapshot was taken. Normally, these blocks should
never be read. However, the readahead code will occationally read
them which can cause unexpected behavior.

Clean up the debugging code that ensures that no blocks be written
on a filesystem while it is suspended. Snapshots must explicitly
label the blocks that they are writing during the suspension so that
they do not cause a `write on suspended filesystem' panic.

Reorganize ffs_copyonwrite() to eliminate a deadlock and also to
prevent a race condition that would permit the same block to be
copied twice. This change eliminates an unexpected soft updates
inconsistency in fsck caused by the double allocation.

Use bqrelse rather than brelse for buffers that will be needed
soon again by the snapshot code. This improves snapshot performance.


# 62976 11-Jul-2000 mckusick

Add snapshots to the fast filesystem. Most of the changes support
the gating of system calls that cause modifications to the underlying
filesystem. The gating can be enabled by any filesystem that needs
to consistently suspend operations by adding the vop_stdgetwritemount
to their set of vnops. Once gating is enabled, the function
vfs_write_suspend stops all new write operations to a filesystem,
allows any filesystem modifying system calls already in progress
to complete, then sync's the filesystem to disk and returns. The
function vfs_write_resume allows the suspended write operations to
begin again. Gating is not added by default for all filesystems as
for SMP systems it adds two extra locks to such critical kernel
paths as the write system call. Thus, gating should only be added
as needed.

Details on the use and current status of snapshots in FFS can be
found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness
is not included here. Unless and until you create a snapshot file,
these changes should have no effect on your system (famous last words).


# 62148 27-Jun-2000 phk

Move prtactive to vfs from ufs. It is used all over the place.


# 59241 15-Apr-2000 rwatson

Introduce extended attribute support for FFS, allowing arbitrary
(name, value) pairs to be associated with inodes. This support is
used for ACLs, MAC labels, and Capabilities in the TrustedBSD
security extensions, which are currently under development.

In this implementation, attributes are backed to data vnodes in the
style of the quota support in FFS. Support for FFS extended
attributes may be enabled using the FFS_EXTATTR kernel option
(disabled by default). Userland utilities and man pages will be
committed in the next batch. VFS interfaces and man pages have
been in the repo since 4.0-RELEASE and are unchanged.

o ufs/ufs/extattr.h: UFS-specific extattr defines
o ufs/ufs/ufs_extattr.c: bulk of support routines
o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes
o contrib/softupdates/ffs_softdep.c: extattr.h includes
o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR

o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h
(This should not be the case, and will be fixed in a future commit)

Currently attributes are not supported in MFS. This will be fixed.

Reviewed by: adrian, bp, freebsd-fs, other unthanked souls
Obtained from: TrustedBSD Project


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 42374 07-Jan-1999 bde

Don't pass unused unused timestamp args to UFS_UPDATE() or waste
time initializing them. This almost finishes centralizing (in-core)
timestamp updates in ufs_itimes().


# 37363 03-Jul-1998 bde

Sync timestamp changes for inodes of special files to disk as late
as possible (when the inode is reclaimed). Temporarily only do
this if option UFS_LAZYMOD configured and softupdates aren't enabled.
UFS_LAZYMOD is intentionally left out of /sys/conf/options.

This is mainly to avoid almost useless disk i/o on battery powered
machines. It's silly to write to disk (on the next sync or when the
inode becomes inactive) just because someone hit a key or something
wrote to the screen or /dev/null.

PR: 5577
Previous version reviewed by: phk


# 34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


# 31486 02-Dec-1997 bde

`nextgennumber' can go away now that is no longer (ab)used by foreign
fs's.


# 30492 16-Oct-1997 phk

Another VFS cleanup "kilo commit"

1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS}
intereface function, and now lives in the ufsmount structure.

2. Remove VOP_SEEK, it was unused.

3. Add mode default vops:

VOP_ADVLOCK vop_einval
VOP_CLOSE vop_null
VOP_FSYNC vop_null
VOP_IOCTL vop_enotty
VOP_MMAP vop_einval
VOP_OPEN vop_null
VOP_PATHCONF vop_einval
VOP_READLINK vop_einval
VOP_REALLOCBLKS vop_eopnotsupp

And remove identical functionality from filesystems

4. Add vop_stdpathconf, which returns the canonical stuff. Use
it in the filesystems. (XXX: It's probably wrong that specfs
and fifofs sets this vop, shouldn't it come from the "host"
filesystem, for instance ufs or cd9660 ?)

5. Try to make system wide VOP functions have vop_* names.

6. Initialize the um_* vectors in LFS.

(Recompile your LKMS!!!)


# 30474 16-Oct-1997 phk

VFS mega cleanup commit (x/N)

1. Add new file "sys/kern/vfs_default.c" where default actions for
VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE,
POLL, REVOKE and STRATEGY. Various stuff spread over the entire
tree belongs here.

2. Change VOP_BLKATOFF to a normal function in cd9660.

3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These
are private interface functions between UFS and the underlying
storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now
live in struct ufsmount instead.

4. Remove a kludge of VOP_ functions in all filesystems, that did
nothing but obscure the simplicity and break the expandability.
If a filesystem doesn't implement VOP_FOO, it shouldn't have an
entry for it in its vnops table. The system will try to DTRT
if it is not implemented. There are still some cruft left, but
the bulk of it is done.

5. Fix another VCALL in vfs_cache.c (thanks Bruce!)


# 30418 14-Oct-1997 phk

I think my previous change may have opened a race conditio.
This patch does the same thing, with no change in semantics.


# 30402 14-Oct-1997 phk

ufs_ihashrem() should not be called from the UFS layer, but from the
lower layer (LFS/FFS/?) like the rest of the ihash functions.
Otherwise it is impossible to make a lower layer that doesn't use the
ihash facility.


# 30285 10-Oct-1997 phk

Make ufs_reclaim free the underlying inode.


# 29041 02-Sep-1997 bde

Removed unused #includes.


# 28774 26-Aug-1997 dyson

Back out some incorrect changes that was worse than the original bug.


# 28558 22-Aug-1997 dyson

This is a trial improvement for the vnode reference count while on the vnode
free list problem. Also, the vnode age flag is no longer used by the
vnode pager. (It is actually incorrect to use then.) Constructive
feedback welcome -- just be kind.


# 24101 22-Mar-1997 bde

Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'. Use a new function gettime(). The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 13260 05-Jan-1996 wollman

Convert QUOTA to new-style option.


# 12117 05-Nov-1995 dyson

Changes to existing files for ext2fs support. The UFS mods need rework
in the future as they are a bit crufty -- but at least the stuff is in the
tree now.


# 5392 04-Jan-1995 gibbs

Change panic messges that are ffs_blah functions to say they are ffs not
ufs functions.


# 3604 15-Oct-1994 ache

Add back variable declaration removed by wrong prevous cleanups.


# 3427 08-Oct-1994 phk

POSSIBLE BOGUS CODE found, (related to dos-partitions) in ufs_disksubr.c,
look for CC_WALL.
Cosmetics, a couple of unused vars.


# 2112 18-Aug-1994 wollman

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 1817 02-Aug-1994 dg

Added $Id$


# 1542 24-May-1994 rgrimes

This commit was generated by cvs2svn to compensate for changes in r1541,
which included commits to RCS files with non-trunk default branches.


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources