#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
262780 |
|
05-Mar-2014 |
pfg |
MFC r262678; ufs: small formatting fixes.
Cleanup some extra space. Use of tabs vs. spaces. No functional change.
Reviewed by: mckusick
|
#
259224 |
|
11-Dec-2013 |
pfg |
MFC r256448, r257029;
Make di_blocks unsigned in UFS1 as is the case already for UFS2. Most of the code between UFS1 and UFS2 is shared so this change is pretty safe. Not only this makes UFS1 and 2 consistent but it also matches what NetBSD and MacOS X have for some years now.
UFS2: make di_extsize unsigned. di_extsize is the EA size and as such it should be unsigned. Adjust related types for consistency.
Reviewed by: mckusick
|
#
251897 |
|
18-Jun-2013 |
scottl |
Merge the second part of the unmapped I/O changes. This enables the infrastructure in the block layer and UFS filesystem as well as a few drivers. The list of MFC revisions is long, so I won't quote changelogs.
r248508,248510,248511,248512,248514,248515,248516,248517,248518, 248519,248520,248521,248550,248568,248789,248790,249032,250936
Submitted by: kib Approved by: kib Obtained from: Netflix
|
#
251144 |
|
30-May-2013 |
scottl |
MFC r248282, in a modified fashion. From the original changelog:
Add currently unused flag argument to the cluster_read(), cluster_write() and cluster_wbuild() functions. The flags to be allowed are a subset of the GB_* flags for getblk().
This merge adds a cluster_*_gb() API variant instead of changing the ABI with an added argument to the existing API. Most API consumers that were changed in the original rev have been left un-changed in this merge to reduce churn. The mergeinfo is recorded though for future merging convenience. This is effectively a no-op for the moment.
Submitted by: kib, FF Obtained from: Netflix
|
#
240238 |
|
08-Sep-2012 |
kib |
MFC r239065: Stop including vm_param.h into vm_page.h. Include vm_param.h explicitely for the kernel code which needs it.
|
#
239850 |
|
29-Aug-2012 |
kib |
MFC r236322: Enable vn_io_fault() lock avoidance for UFS.
|
#
237713 |
|
28-Jun-2012 |
kib |
Fix unbounded-length malloc, controlled from usermode. The added check is performed before exact size of the buffer is calculated, but the buffer cannot have size greater then the total space allocated for extended attributes. The existing check is executing with precise size, but it is too late, since buffer needs to be allocated in advance.
Also, adapt to uio_resid being of ssize_t type. Use lblktosize instead of multiplying by fs block size by hand as well.
|
#
233654 |
|
29-Mar-2012 |
kib |
MFC r232948: Supply boolean as the second argument to ffs_update(), and not a MNT_[NO]WAIT constants, which in fact always caused sync operation.
|
#
233630 |
|
28-Mar-2012 |
mckusick |
MFC of 232351, 233438, and 233629
MFC reviewed by: kib
MFC 232351:
This change avoids a kernel deadlock on "snaplk" when using snapshots on UFS filesystems running with journaled soft updates. This is the first of several bugs that need to be fixed before removing the restriction added in -r230250 to prevent the use of snapshots on filesystems running with journaled soft updates.
The deadlock occurs when holding the snapshot lock (snaplk) and then trying to flush an inode via ffs_update(). We become blocked by another process trying to flush a different inode contained in the same inode block that we need. It holds the inode block for which we are waiting locked. When it tries to write the inode block, it gets blocked waiting for the our snaplk when it calls ffs_copyonwrite() to see if the inode block needs to be copied in our snapshot.
The most obvious place that this deadlock arises is in the ffs_copyonwrite() routine when it updates critical metadata in a snapshot and tries to write it out before proceeding. The fix here is to write the data and indirect block pointer for the snapshot, but to skip the call to ffs_update() to write the snapshot inode. To ensure that we will never have to update a pointer in the inode itself, the ffs_snapshot() routine that creates the snapshot has to ensure that all the direct blocks are allocated as part of the creation of the snapshot.
A less obvious place that this deadlock occurs is when we hold the snaplk because we are deleting a snapshot. In the course of doing the deletion, we need to allocate various soft update dependency structures and allocate some journal space. If we hit a resource limit while doing this we decrease the resources in use by flushing out an existing dirty file to get it to give up the soft dependency resources that it holds. The flush can cause an ffs_update() to be done on the inode for the file that we have selected to flush resulting in the same deadlock as described above when the inode that we have chosen to flush resides in the same inode block as the snapshot inode that we hold. The fix is to defer cleaning up any time that the inode on which we are operating is a snapshot.
Help and review by: Jeff Roberson Tested by: Peter Holm
MFC 233438:
Add a third flags argument to ffs_syncvnode to avoid a possible conflict with MNT_WAIT flags that passed in its second argument.
Discussed with: kib
MFC 233629:
A refinement of change 232351 to avoid a race with a forcible unmount. While we have a snapshot vnode unlocked to avoid a deadlock with another inode in the same inode block being updated, the filesystem containing it may be forcibly unmounted. When that happens the snapshot vnode is revoked. We need to check for that condition and fail appropriately.
Spotted by: kib Reviewed by: kib
|
#
233439 |
|
24-Mar-2012 |
kib |
MFC r232834: In ffs_syncvnode(), pass boolean false as second argument of ffs_update(). Synchronous inode block update is not needed for MNT_LAZY callers (syncer), and since waitfor values are not zero, code did unneccessary synchronous update.
|
#
233353 |
|
23-Mar-2012 |
kib |
MFC r231949: Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode.
MFC r232493: Remove unneeded cast to u_int. The values as small enough to fit into int, beside the use of MIN macro which performs type promotions.
MFC r232494: Instead of incomplete handling of read(2)/write(2) return values that does not fit into registers, declare that we do not support this case using CTASSERT(), and remove endianess-unsafe code to split return value into td_retval.
While there, change the style of the sysctl debug.iosize_max_clamp definition.
MFC r232495: pipe_read(): change the type of size to int, and remove signed clamp. pipe_write(): change the type of desiredsize back to int, its value fits.
|
#
232956 |
|
14-Mar-2012 |
kib |
MFC r232833: Remove not needed ARGSUSED lint command.
|
#
231953 |
|
20-Feb-2012 |
kib |
MFC r231313 (by mckusick): First attempt the uiomove() to the newly allocated (and dirty) buffer and only zeros it if the uiomove() fails. The effect is to eliminate the gratuitous zeroing of the buffer in the usual case where the uiomove() successfully fills it.
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
224503 |
|
29-Jul-2011 |
mckusick |
Update to -r224294 to ensure that only one of MNT_SUJ or MNT_SOFTDEP is set so that mount can revert back to using MNT_NOWAIT when doing getmntinfo.
Approved by: re (kib)
|
#
222958 |
|
10-Jun-2011 |
jeff |
Implement fully asynchronous partial truncation with softupdates journaling to resolve errors which can cause corruption on recovery with the old synchronous mechanism.
- Append partial truncation freework structures to indirdeps while truncation is proceeding. These prevent new block pointers from becoming valid until truncation completes and serialize truncations. - On completion of a partial truncate journal work waits for zeroed pointers to hit indirects. - softdep_journal_freeblocks() handles last frag allocation and last block zeroing. - vtruncbuf/ffs_page_remove moved into softdep_*_freeblocks() so it is only implemented in one place. - Block allocation failure handling moved up one level so it does not proceed with buf locks held. This permits us to do more extensive reclaims when filesystem space is exhausted. - softdep_sync_metadata() is broken into two parts, the first executes once at the start of ffs_syncvnode() and flushes truncations and inode dependencies. The second is called on each locked buf. This eliminates excessive looping and rollbacks. - Improve the mechanism in process_worklist_item() that handles acquiring vnode locks for handle_workitem_remove() so that it works more generally and does not loop excessively over the same worklist items on each call. - Don't corrupt directories by zeroing the tail in fsck. This is only done for regular files. - Push a fsync complete record for files that need it so the checker knows a truncation in the journal is no longer valid.
Discussed with: mckusick, kib (ffs_pages_remove and ffs_truncate parts) Tested by: pho
|
#
221281 |
|
30-Apr-2011 |
kib |
Fix typos.
Noted by: Fabian Keil <freebsd-listen fabiankeil de> Pointy hat to: kib MFC after: 1 week
|
#
221261 |
|
30-Apr-2011 |
kib |
Clarify the comment.
MFC after: 1 week
|
#
209717 |
|
06-Jul-2010 |
jeff |
- Handle the truncation of an inode with an effective link count of 0 in the context of the process that reduced the effective count. Previously all truncation as a result of unlink happened in the softdep flush thread. This had the effect of being impossible to rate limit properly with the journal code. Now the process issuing unlinks is suspended when the journal files. This has a side-effect of improving rm performance by allowing more concurrent work. - Handle two cases in inactive, one for effnlink == 0 and another when nlink finally reaches 0. - Eliminate the SPACECOUNTED related code since the truncation is no longer delayed.
Discussed with: mckusick
|
#
207728 |
|
06-May-2010 |
alc |
Eliminate page queues locking around most calls to vm_page_free().
|
#
207669 |
|
05-May-2010 |
alc |
Acquire the page lock around all remaining calls to vm_page_free() on managed pages that didn't already have that lock held. (Freeing an unmanaged page, such as the various pmaps use, doesn't require the page lock.)
This allows a change in vm_page_remove()'s locking requirements. It now expects the page lock to be held instead of the page queues lock. Consequently, the page queues lock is no longer required at all by callers to vm_page_rename().
Discussed with: kib
|
#
207662 |
|
05-May-2010 |
trasz |
Move checking against RLIMIT_FSIZE into one place, vn_rlimit_fsize().
Reviewed by: kib
|
#
207141 |
|
24-Apr-2010 |
jeff |
- Merge soft-updates journaling from projects/suj/head into head. This brings in support for an optional intent log which eliminates the need for background fsck on unclean shutdown.
Sponsored by: iXsystems, Yahoo!, and Juniper. With help from: McKusick and Peter Holm
|
#
195265 |
|
01-Jul-2009 |
trasz |
Don't panic on attempt to set ACL on a block device file. This is just a part of kern/125613.
PR: kern/125613 Submitted by: Jaakko Heinonen <jh at saunalahti dot fi> Reviewed by: rwatson Approved by: re (kib)
|
#
195187 |
|
30-Jun-2009 |
kib |
For SU mounts, softdep_fsync() might drop vnode lock, allowing other threads to put dirty buffers on the vnode bufobj list. For regular files and synchronous fsync requests, check for the condition and restart the fsync vop if a new dirty buffer arrived.
Tested by: pho Approved by: re (kensmith) MFC after: 1 month
|
#
190469 |
|
27-Mar-2009 |
kib |
Correct typo.
Noted by: kensmith
|
#
189737 |
|
12-Mar-2009 |
kib |
The non-modifying EA VOPs are executed with only shared vnode lock taken. Provide a custom lock around initializing and tearing down EA area, to prevent both memory leaks and double-free of it. Count the number of EA area accessors.
Lock protocol requires either holding exclusive vnode lock to modify i_ea_area, or shared vnode lock and owning IN_EA_LOCKED flag in i_flag.
Noted by: YAMAMOTO, Taku <taku tackymt homeip net> Tested by: pho (previous version) MFC after: 2 weeks
|
#
187790 |
|
27-Jan-2009 |
rwatson |
Following a fair amount of real world experience with ACLs and extended attributes since FreeBSD 5, make the following semantic changes:
- Don't update the inode modification time (mtime) when extended attributes (and hence also ACLs) are added, modified, or removed. - Don't update the inode access tie (atime) when extended attributes (and hence also ACLs) are queried.
This means that rsync (and related tools) won't improperly think that the data in the file has changed when only the ACL has changed.
Note that ffs_reallocblks() has not been changed to not update on an IO_EXT transaction, but currently EAs don't use the cluster write routines so this shouldn't be a problem. If EAs grow support for clustering, then VOP_REALLOCBLKS() will need to grow a flag argument to carry down IO_EXT to UFS.
MFC after: 1 week PR: ports/125739 Reported by: Alexander Zagrebin <alexz@visp.ru> Tested by: pluknet <pluknet@gmail.com>, Greg Byshenk <freebsd@byshenk.net> Discussed with: kib, kientzle, timur, Alexander Bokovoy <ab@samba.org>
|
#
187468 |
|
20-Jan-2009 |
kib |
When extending inode size, we call vnode_pager_setsize(), to have a address space where to put vnode pages, and then call UFS_BALLOC(), to actually allocate new block and map it. When UFS_BALLOC() returns error, sometimes we forget to revert the vm object size increase, allowing for the pages that are not backed by the logical disk blocks.
Revert vnode_pager_setsize() back when UFS_BALLOC() failed, for ffs_truncate() and ffs_write().
PR: 129956 Reviewed by: ups MFC after: 3 weeks
|
#
184074 |
|
20-Oct-2008 |
kib |
Assert that v_holdcnt is non-zero before entering lockmgr in vn_lock and ffs_lock. This cannot catch situations where holdcnt is incremented not by curthread, but I think it is useful.
Reviewed by: tegge, attilio Tested by: pho MFC after: 2 weeks
|
#
182721 |
|
03-Sep-2008 |
trasz |
When calling extattr_check_cred, use V{READ,WRITE}, not I{READ,WRITE}.
Approved by: rwatson (mentor)
|
#
177779 |
|
31-Mar-2008 |
jeff |
- Since rev 1.142 of ffs_snapshot.c the interlock has not been required to protect the v_lock pointer. Removing the interlock acquisition here allows vn_lock() to proceed without requiring the interlock at all. - If the lock mutated while we were sleeping on it the interlock has been dropped. It is conceivable that the upper layer code was relying on the interlock and LK_NOWAIT to protect the identity or state of the vnode while acquiring the lock. In this case return EBUSY rather than trying the new lock to prevent potential races.
Reviewed by: tegge
|
#
177493 |
|
22-Mar-2008 |
jeff |
- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find.
Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)
|
#
177474 |
|
21-Mar-2008 |
kib |
Reduce the acquisition of the vnode interlock in the ffs_read() and ffs_extread() when setting the IN_ACCESS flag by checking whether the IN_ACCESS is already set. The possible race there is admissible.
Tested by: pho Submitted by: jeff
|
#
176564 |
|
25-Feb-2008 |
keramida |
Minor typo nit.
|
#
176320 |
|
15-Feb-2008 |
attilio |
- Introduce lockmgr_args() in the lockmgr space. This function performs the same operation of lockmgr() but accepting a custom wmesg, prio and timo for the particular lock instance, overriding default values lkp->lk_wmesg, lkp->lk_prio and lkp->lk_timo. - Use lockmgr_args() in order to implement BUF_TIMELOCK() - Cleanup BUF_LOCK() - Remove LK_INTERNAL as it is nomore used in the lockmgr namespace
Tested by: Andrea Barberio <insomniac at slackware dot it>
|
#
175635 |
|
24-Jan-2008 |
attilio |
Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and no currently used. Hopefully this will be soonly replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present
Addictionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h
KPI results heavilly broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits.
Tested by: matteo
|
#
175294 |
|
13-Jan-2008 |
attilio |
VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
|
#
175053 |
|
01-Jan-2008 |
obrien |
style(9)
|
#
173464 |
|
08-Nov-2007 |
obrien |
Turn most ffs 'DIAGNOSTIC's into INVARIANTS.
|
#
171437 |
|
13-Jul-2007 |
rodrigc |
Perform range check before allocating memory when reading extended attributes.
Reviewed by: kib Approved by: re (hrs) PR: 114389
|
#
170587 |
|
11-Jun-2007 |
rwatson |
Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp Obtained from: TrustedBSD Project
|
#
169671 |
|
18-May-2007 |
kib |
Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.
|
#
168353 |
|
04-Apr-2007 |
delphij |
Use *_EMPTY macros when appropriate.
|
#
167719 |
|
19-Mar-2007 |
brian |
When we write extended attributes, assert that the inode hasn't already been deleted. The assertion is important to show that we won't end up accounting for extended attribute blocks (using fs_pendingblocks) in our subsequent call to fs_alloc().
Agreed verbally by: mckusick
MFC after: 3 weeks
|
#
167155 |
|
01-Mar-2007 |
pjd |
Fix build breakage.
|
#
167152 |
|
01-Mar-2007 |
pjd |
Rename PRIV_VFS_CLEARSUGID to PRIV_VFS_RETAINSUGID, which seems to better describe the privilege.
OK'ed by: rwatson
|
#
167151 |
|
01-Mar-2007 |
pjd |
Avoid checking for privileges if there is no need to.
Discussed with: rwatson
|
#
166864 |
|
21-Feb-2007 |
mckusick |
The functions that set and delete external attributes must check that the filesystem is not mounted read-only before proceeding.
Reported by: Ryan Beasley <ryanb@FreeBSD.org> MFC after: 1 week
|
#
166774 |
|
15-Feb-2007 |
pjd |
Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic.
Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation.
VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE.
Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs
|
#
164248 |
|
13-Nov-2006 |
kmacy |
change vop_lock handling to allowing tracking of callers' file and line for acquisition of lockmgr locks
Approved by: scottl (standing in for mentor rwatson)
|
#
164033 |
|
06-Nov-2006 |
rwatson |
Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking.
Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>
|
#
163194 |
|
10-Oct-2006 |
kib |
Do not translate the IN_ACCESS inode flag into the IN_MODIFIED while filesystem is suspending/suspended. Doing so may result in deadlock. Instead, set the (new) IN_LAZYACCESS flag, that becomes IN_MODIFIED when suspend is lifted.
Change the locking protocol in order to set the IN_ACCESS and timestamps without upgrading shared vnode lock to exclusive (see comments in the inode.h). Before that, inode was modified while holding only shared lock.
Tested by: Peter Holm Reviewed by: tegge, bde Approved by: pjd (mentor) MFC after: 3 weeks
|
#
158321 |
|
05-May-2006 |
tegge |
Avoid locking overhead when snapshots are disabled.
|
#
158259 |
|
02-May-2006 |
tegge |
Close a race when VOP_LOCK() on a snapshot file is attempted at the same time as it is changed back into a normal file. The locker would get the shared "snaplk" lock which would no longer be the correct lock for the vnode.
|
#
151184 |
|
09-Oct-2005 |
tegge |
Adjust totread argument passed to cluster_read() to account for offset not being block aligned.
|
#
147198 |
|
09-Jun-2005 |
ssouhlal |
Allow EVFILT_VNODE events to work on every filesystem type, not just UFS by: - Making the pre and post hooks for the VOP functions work even when DEBUG_VFS_LOCKS is not defined. - Moving the KNOTE activations into the corresponding VOP hooks. - Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct mount that permits filesystems to disable the new behavior. - Creating a default VOP_KQFILTER function: vfs_kqfilter()
My benchmarks have not revealed any performance degradation.
Reviewed by: jeff, bde Approved by: rwatson, jmg (kqueue changes), grehan (mentor)
|
#
144373 |
|
31-Mar-2005 |
jeff |
- Set LK_NOSHARE for snapshot locks. snapshots require exclusive only access. - Remove the hack from ffs_lock() to implement LK_NOSHARE in a ffs specific way.
Sponsored by: Isilon Systems, Inc.
|
#
143504 |
|
13-Mar-2005 |
jeff |
- It is not legal to access v_data without the vnode lock or interlock held. Grab the vnode interlock if LK_INTERLOCK has not been passed in so that we can inspect v_data in ffs_lock().
Sponsored by: Isilon Systems, Inc.
|
#
141542 |
|
08-Feb-2005 |
phk |
Split the vop_vector for ffs1 and ffs2, this is mostly for the different EXTATTR support.
|
#
141533 |
|
08-Feb-2005 |
phk |
Drag another softupdates tentacle back into FFS: Now that FFS's vop_fsync is separate from the internal use we can do the full job there.
|
#
141526 |
|
08-Feb-2005 |
phk |
Don't use the UFS_* and VFS_* functions where a direct call is possble.
The UFS_ functions are for UFS to call back into VFS. The VFS functions are external entry points into the filesystem.
|
#
141525 |
|
08-Feb-2005 |
phk |
(forced commit to record correct commit message)
Split ffs_fsync() into a VOP_FSYNC() component and an internal part called ffs_syncvnode().
Eliminate unnecessary thread argument and XXX'ed curthread passes for same. Reduce softdep_sync_metadata() from a struct vop_fsync_args to just the vnode argument it needs.
Convert internal VOP_FSYNC() calls to use ffs_syncvnode().
|
#
141522 |
|
08-Feb-2005 |
phk |
For snapshots we need all VOP_LOCKs to be exclusive.
The "business class upgrade" was implemented in UFS's VOP_LOCK implementation ufs_lock() which is the wrong layer, so move it to ffs_lock().
Also, as long as we have not abandonned advanced vfs-stacking we should not preclude it from happening: instead of implementing a copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at vop_stdlock() at the bottom.
|
#
141521 |
|
08-Feb-2005 |
phk |
For snapshots we need all VOP_LOCKs to be exclusive.
The "business class upgrade" was implemented in UFS's VOP_LOCK implementation ufs_lock() which is the wrong layer, so move it to ffs_lock().
Also, as long as we have not abandonned advanced vfs-stacking we should not preclude it from happening: instead of implementing a copy locally, use the VOP_LOCK_APV(&ufs) to correctly arrive at vop_stdlock() at the bottom.
|
#
141520 |
|
08-Feb-2005 |
phk |
Use VOP_STRATEGY_APV() instead of direct dereference, this is more correct.
|
#
140707 |
|
24-Jan-2005 |
jeff |
- Remove GIANT_REQUIRED where giant is no longer required.
Sponsored By: Isilon Systems, Inc.
|
#
139825 |
|
07-Jan-2005 |
imp |
/* -> /*- for license, minor formatting changes
|
#
138869 |
|
14-Dec-2004 |
phk |
white space
|
#
138290 |
|
01-Dec-2004 |
phk |
Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals.
Adjust to more contemporary circumstances and gain type checking.
Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place.
Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc.
Give coda correct prototypes and function definitions for all vop_()s.
Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods.
Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector.
Remove a lot of vfs_init since vop_vector is ready to use from the compiler.
Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc.
Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse.
Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)
|
#
138270 |
|
01-Dec-2004 |
phk |
Mechanically change prototypes for vnode operations to use the new typedefs.
|
#
137846 |
|
18-Nov-2004 |
jeff |
- Eliminate the acquisition and release of the bqlock in bremfree() by setting the B_REMFREE flag in the buf. This is done to prevent lock order reversals with code that must call bremfree() with a local lock held. This also reduces overhead by removing two lock operations per buf for fsync() and similar. - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock has been acquired so that we may remove ourself from the free-list. - Provide a bremfreef() function to immediately remove a buf from a free-list for use only by NFS. This is done because the nfsclient code overloads the b_freelist queue for its own async. io queue. - Simplify the numfreebuffers accounting by removing a switch statement that executed the same code in every possible case. - getnewbuf() can encounter locked bufs on free-lists once Giant is removed. Remove a panic associated with this condition and delay asserts that inspect the buf until after it is locked.
Reviewed by: phk Sponsored by: Isilon Systems, Inc.
|
#
137035 |
|
29-Oct-2004 |
phk |
Move UFS from DEVFS backing to GEOM backing.
This eliminates a bunch of vnode overhead (approx 1-2 % speed improvement) and gives us more control over the access to the storage device.
Access counts on the underlying device are not correctly tracked and therefore it is possible to read-only mount the same disk device multiple times: syv# mount -p /dev/md0 /var ufs rw 2 2 /dev/ad0 /mnt ufs ro 1 1 /dev/ad0 /mnt2 ufs ro 1 1 /dev/ad0 /mnt3 ufs ro 1 1
Since UFS/FFS is not a synchrousely consistent filesystem (ie: it caches things in RAM) this is not possible with read-write mounts, and the system will correctly reject this.
Details:
Add a geom consumer and a bufobj pointer to ufsmount.
Eliminate the vnode argument from softdep_disk_prewrite(). Pick the vnode out of bp->b_vp for now. Eventually we should find it through bp->b_bufobj->b_private.
In the mountcode, use g_vfs_open() once we have used VOP_ACCESS() to check permissions.
When upgrading and downgrading between r/o and r/w do the right thing with GEOM access counts. Remove all the workarounds for not being able to do this with VOP_OPEN().
If we are the root mount, drop the exclusive access count until we upgrade to r/w. This allows fsck of the root filesystem and the MNT_RELOAD to work correctly.
Set bo_private to the GEOM consumer on the device bufobj.
Change the ffs_ops->strategy function to call g_vfs_strategy()
In ufs_strategy() directly call the strategy on the disk bufobj. Same in rawread.
In ffs_fsync() we will no longer see VCHR device nodes, so remove code which synced the filesystem mounted on it, in case we came there. I'm not sure this code made sense in the first place since we would have taken the specfs route on such a vnode.
Redo the highly bogus readblock() function in the snapshot code to something slightly less bogus: Constructing an uio and using physio was really quite a detour. Instead just fill in a bio and ship it down.
|
#
136988 |
|
27-Oct-2004 |
phk |
Eliminate unnecessary KASSERTS.
|
#
136943 |
|
25-Oct-2004 |
phk |
Loose the v_dirty* and v_clean* alias macros.
Check the count field where we just want to know the full/empty state, rather than using TAILQ_EMPTY() or TAILQ_FIRST().
|
#
136751 |
|
21-Oct-2004 |
phk |
Move the VI_BWAIT flag into no bo_flag element of bufobj and call it BO_WWAIT
Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup().
Use these functions all relevant places except in ffs_softdep.c where the use if interlocked_sleep() makes this impossible.
Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.
|
#
136721 |
|
20-Oct-2004 |
rwatson |
Explicitly break out NETA license from Berkeley license to clearly indicate license grant, as well as to indicate that NETA is asserting only two clauses, not four clauses.
Requested by: imp
|
#
135877 |
|
28-Sep-2004 |
phk |
Remove support for accessing device nodes in UFS/FFS.
Device nodes can still be created and exported with NFS.
|
#
135858 |
|
27-Sep-2004 |
phk |
Give cluster_write() an explicit vnode argument.
In the future a struct buf will not automatically point out a vnode for us.
|
#
135459 |
|
19-Sep-2004 |
phk |
The getpages VOP was a good stab at getting scatter/gather I/O without too much kernel copying, but it is not the right way to do it, and it is in the way for straightening out the buffer cache.
The right way is to pass the VM page array down through the struct bio to the disk device driver and DMA directly in to/out off the physical memory. Once the VM/buf thing is sorted out it is next on the list.
Retire most of vnode method. ffs_getpages(). It is not clear if what is left shouldn't be in the default implementation which we now fall back to.
Retire specfs_getpages() as well, as it has no users now.
|
#
133741 |
|
15-Aug-2004 |
jmg |
Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowlege of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using it's filter ops.
Currently, it uses the MTX_DUPOK even though it is not always safe to aquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases).
Reviewed by: green, rwatson (both earlier versions)
|
#
132775 |
|
28-Jul-2004 |
kan |
Avoid using casts as lvalues. Introduce DIP_SET macro which sets proper inode field based on UFS version. Use DIP ro read values and DIP_SET to modify them throughout FFS code base.
|
#
132653 |
|
26-Jul-2004 |
cperciva |
Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags.
The old name is still defined, but will be removed in a few days (unless I hear any complaints...)
Discussed with: rwatson, scottl Requested by: jhb
|
#
129545 |
|
21-May-2004 |
kensmith |
Upon further review it was decided this piece of the msync(2) fixes was applicable to HEAD, originally it was thought this should only be done in RELENG_4. Implement IO_INVAL in the vnode op for writing by marking the buffer as "no cache". This fix has already been applied to RELENG_4 as Rev. 1.65.2.15 of ufs/ufs/ufs_readwrite.c.
Reviewed by: alc, tegge
|
#
128006 |
|
07-Apr-2004 |
bde |
Record where half the bits in this file came from (from ufs_readwrite.c). Damage to history from moving bits was especially large since a repo copy is not feasible for partial files.
|
#
127975 |
|
07-Apr-2004 |
imp |
Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and irc message from Robert Watson saying that clause 3 can be removed from those files with an NAI copyright that also have only a University of California copyrights.
Approved by: core, rwatson
|
#
125710 |
|
11-Feb-2004 |
bde |
Removed more vestiges of vfs_ioopt: - rev.1.42 of ffs_readwrite.c added a special case in ffs_read() for reads that are initially at EOF, and rev.1.62 of ufs_readwrite.c fixed timestamp bugs in it. Removal of most of vfs_ioopt made it just and optimization, and removal of the vm object reference calls made it less than an optimization. It was cloned in rev.1.94 of ufs_readwrite.c as part of cloning ffs_extwrite() although it was always less than an optimization in ffs_extwrite(). - some comments, compound statements and vertical whitespace were vestiges of dead code.
|
#
125454 |
|
04-Feb-2004 |
jhb |
Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that is is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't used the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists.
Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64
|
#
125259 |
|
31-Jan-2004 |
alc |
Remove unnecessary vm object reference and deallocate calls from ffs_read() and ffs_write(). These calls trace their origins to the dead vfs_ioopt code, first appearing in revision 1.39 of ufs_readwrite.c.
Observed by: bde Discussed with: tegge
|
#
125079 |
|
27-Jan-2004 |
ache |
Turn uio_resid/uio_offset comments into KASSERTs
Reviewed by: bde
|
#
124857 |
|
23-Jan-2004 |
ache |
Copy comment about caller check from ffs_read to ffs_extread, don't check for uio_resid < 0 here too.
|
#
124856 |
|
23-Jan-2004 |
ache |
Fix various panic() strings to reflect true function name to allow easy grep. Small code reorganization to look more logic. Copy ffs_write check from prev. commit to ffs_extwrite.
|
#
124855 |
|
23-Jan-2004 |
ache |
ffs_read: Replace wrong check returned EFBIG with EOVERFLOW handling from POSIX:
36708 [EOVERFLOW] The file is a regular file, nbyte is greater than 0, the starting position is before the end-of-file, and the starting position is greater than or equal to the offset maximum established in the open file description associated with fildes.
ffs_write: Replace u_int64_t cast with uoff_t cast which is more natural for types used.
ffs_write & ffs_read: Remove uio_offset and uio_resid checks for negative values, the caller supposed to do it already. Add comments about it.
Reviewed by: bde
|
#
124728 |
|
19-Jan-2004 |
kan |
Spell magic '16' number as IO_SEQSHIFT.
|
#
120763 |
|
04-Oct-2003 |
alc |
Synchronize access to a vm page's valid field using the containing vm object's lock.
|
#
118607 |
|
07-Aug-2003 |
jhb |
Consistently use the BSD u_int and u_short instead of the SYSV uint and ushort. In most of these files, there was a mixture of both styles and this change just makes them self-consistent.
Requested by: bde (kern_ktrace.c)
|
#
118131 |
|
28-Jul-2003 |
rwatson |
Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the kernel ACL interfaces and system call names.
Break out UFS2 and FFS extattr delete and list vnode operations from setextattr and getextattr to deleteextattr and listextattr, which cleans up the implementations, and makes the results more readable, and makes the APIs more clear.
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
116423 |
|
15-Jun-2003 |
alc |
Lock the vm object when freeing pages.
|
#
116412 |
|
15-Jun-2003 |
phk |
Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations to check that the buffer points to the correct vnode.
|
#
116192 |
|
11-Jun-2003 |
obrien |
Use __FBSDID().
|
#
115869 |
|
05-Jun-2003 |
rwatson |
Implement ffs_listextattr() by breaking out that logic and special-cased attribute name of "" from ffs_getextattr(). Invoking VOP_GETETATTR() with an empty name is now no longer supported; user application compatibility is provided by a system call level compatibility wrapper. We make sure to explicitly reject attempts to set an EA with the name "".
Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories
|
#
115588 |
|
01-Jun-2003 |
rwatson |
Return EOPNOTSUPP for attempted EA operations on VCHR vnodes in UFS2; if we permit them to occur, the kernel panics due to our performing EA operations using VOP_STRATEGY on the vnode. This went unnoticed previously because there are very for users of device nodes on UFS2 due to the introduction of devfs. However, this can come up with the Linux compat directories and its hard-coded dev nodes (which will need to go away as we move away from hard-coded device numbers). This can come up if you use EA-intensive features such as ACLs and MAC.
The proper fix is pretty complicated, but this band-aid would be an excellent MFC candidate for the release.
|
#
115474 |
|
31-May-2003 |
phk |
Remove unused local variables.
Found by: FlexeLint
|
#
115456 |
|
31-May-2003 |
phk |
The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent deadlocks with vnode backed md(4) devices because md now uses a kthread to run the bio requests instead of doing it directly from the bio down path.
|
#
114599 |
|
03-May-2003 |
alc |
Lock the vm_object on entry to vm_object_vndeallocate().
|
#
114216 |
|
29-Apr-2003 |
kan |
Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h>
Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>
|
#
112694 |
|
26-Mar-2003 |
tegge |
Add support for reading directly from file to userland buffer when the O_DIRECT descriptor status flag is set and both offset and length is a multiple of the physical media sector size.
|
#
112181 |
|
13-Mar-2003 |
jeff |
- Remove a race between fsync like functions and flushbufqueues() by requiring locked bufs in vfs_bio_awrite(). Previously the buf could have been written out by fsync before we acquired the buf lock if it weren't for giant. The cluster_wbuild() handles this race properly but the single write at the end of vfs_bio_awrite() would not. - Modify flushbufqueues() so there is only one copy of the loop. Pass a parameter in that says whether or not we should sync bufs with deps. - Call flushbufqueues() a second time and then break if we couldn't find any bufs without deps.
|
#
111937 |
|
06-Mar-2003 |
alc |
Remove ENABLE_VFS_IOOPT. It is a long unfinished work-in-progress.
Discussed on: arch@
|
#
111463 |
|
25-Feb-2003 |
jeff |
- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock.
Reviewed by: arch, mckusick
|
#
111119 |
|
19-Feb-2003 |
imp |
Back out M_* changes, per decision of the TRB.
Approved by: trb
|
#
110584 |
|
09-Feb-2003 |
jeff |
- Cleanup unlocked accesses to buf flags by introducing a new b_vflag member that is protected by the vnode lock. - Move B_SCANNED into b_vflags and call it BV_SCANNED. - Create a vop_stdfsync() modeled after spec's sync. - Replace spec_fsync, msdos_fsync, and hpfs_fsync with the stdfsync and some fs specific processing. This gives all of these filesystems proper behavior wrt MNT_WAIT/NOWAIT and the use of the B_SCANNED flag. - Annotate the locking in buf.h
|
#
109623 |
|
21-Jan-2003 |
alfred |
Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
|
#
105422 |
|
18-Oct-2002 |
dillon |
Fix a file-rewrite performance case for UFS[2]. When rewriting portions of a file in chunks that are less then the filesystem block size, if the data is not already cached the system will perform a read-before-write. The problem is that it does this on a block-by-block basis, breaking up the I/Os and making clustering impossible for the writes. Programs such as INN using cyclic file buffers suffer greatly. This problem is only going to get worse as we use larger and larger filesystem block sizes.
The solution is to extend the sequential heuristic so UFS[2] can perform a far larger read and readahead when dealing with this case.
(note: maximum disk write bandwidth is 27MB/sec thru filesystem) (note: filesystem blocksize in test is 8K (1K frag)) dd if=/dev/zero of=test.dat bs=1k count=2m conv=notrunc
Before: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 14.21 598 8.30 0.00 0 0.00 0.00 0 0.00 0 0 7 1 92 0 76 14.09 813 11.19 0.00 0 0.00 0.00 0 0.00 0 0 9 5 86 0 76 14.28 821 11.45 0.00 0 0.00 0.00 0 0.00 0 0 8 1 91
After: (note half of these are reads) tty da0 da1 acd0 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 76 63.62 434 26.99 0.00 0 0.00 0.00 0 0.00 0 0 18 1 80 0 76 63.58 424 26.30 0.00 0 0.00 0.00 0 0.00 0 0 17 2 82 0 76 63.82 438 27.32 0.00 0 0.00 0.00 0 0.00 1 0 19 2 79
Reviewed by: mckusick Approved by: re X-MFC after: immediately (was heavily tested in -stable for 4 months)
|
#
105136 |
|
14-Oct-2002 |
mckusick |
When reading or writing the extended attributes of a special device or fifo in UFS2, the normal ufs_strategy routine needs to be used rather than the spec_strategy or fifo_strategy routine. Thus the ffsext_strategy routine is interposed in the ffs_vnops vectors for special devices and fifo's to pick off this special case. Otherwise it simply falls through to the usual spec_strategy or fifo_strategy routine.
Submitted by: Robert Watson <rwatson@FreeBSD.org> Sponsored by: DARPA & NAI Labs.
|
#
104346 |
|
02-Oct-2002 |
dd |
size_t is not a struct (fix mislabelling in a comment).
|
#
104094 |
|
28-Sep-2002 |
phk |
Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too.
Inspired by: FlexeLint warning #512
|
#
104051 |
|
27-Sep-2002 |
phk |
Use our mount-credential if we get a NOCRED when we try to write out EA space back to disk.
This is wrong in many ways, but not as wrong as a panic.
Pancied on: rwatson & jmallet Sponsored by: DARPA & NAI Labs.
|
#
103946 |
|
25-Sep-2002 |
jeff |
- Convert locks to use standard macros. - Lock access to the buflists. - Document broken locking. - Use vrefcnt().
|
#
102991 |
|
05-Sep-2002 |
phk |
Implement the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods.
Use extattr_check_cred() to check access to EAs.
This is still a WIP.
Sponsored by: DARPA & NAI Labs.
|
#
102957 |
|
05-Sep-2002 |
bde |
Include <sys/malloc.h> instead of depending on namespace pollution 2 layers deep in <sys/proc.h> or <sys/vnode.h>.
Include <sys/vmmeter.h> instead of depending on namespace pollution in <sys/pcpu.h>.
Sorted includes as much as possible.
|
#
102608 |
|
30-Aug-2002 |
phk |
Correctly handle setting, getting and deleting EA's with zero length content.
Sponsored by: DARPA & NAI Labs.
|
#
102382 |
|
24-Aug-2002 |
alc |
o Retire vm_page_zero_fill() and vm_page_zero_fill_area(). Ever since pmap_zero_page() and pmap_zero_page_area() were modified to accept a struct vm_page * instead of a physical address, vm_page_zero_fill() and vm_page_zero_fill_area() have served no purpose.
|
#
102175 |
|
20-Aug-2002 |
phk |
Implement list of EA return functionality. Correctly delete EA's when the content length is set to zero.
Sponsored by: DARPA & NAI Labs.
|
#
102090 |
|
19-Aug-2002 |
phk |
First snapshot of UFS2 EA support.
Sponsored by: DARPA & NAI Labs.
|
#
101789 |
|
13-Aug-2002 |
phk |
Expand the arguments to ffs_ext{read,write}() to their component parts rather than use vop_{read,write}_args. Access to these functions will ultimately not be available through the "vop_{read,write}+IO_EXT" API but this functionality is retained for debugging purposes for now.
Sponsored by: DARPA & NAI Labs.
|
#
101780 |
|
13-Aug-2002 |
phk |
Unravel the UFS_EXTATTR incest between FFS and UFS: UFS_EXTATTR is an UFS only thing, and FFS should in principle not know if it is enabled or not.
This commit cleans ffs_vnops.c for such knowledge, but not ffs_vfsops.c
Sponsored by: DARPA and NAI Labs.
|
#
101720 |
|
12-Aug-2002 |
phk |
Stop pretending that the FFS file ufs_readwrite.c is a UFS file.
Instead of #including it, pull it into ffs_vnops.c and name things correctly.
Sponsored by: DARPA & NAI Labs.
|
#
101308 |
|
04-Aug-2002 |
jeff |
- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking.
Idea stolen from: BSD/OS
|
#
98542 |
|
21-Jun-2002 |
mckusick |
This commit adds basic support for the UFS2 filesystem. The UFS2 filesystem expands the inode to 256 bytes to make space for 64-bit block pointers. It also adds a file-creation time field, an ability to use jumbo blocks per inode to allow extent like pointer density, and space for extended attributes (up to twice the filesystem block size worth of attributes, e.g., on a 16K filesystem, there is space for 32K of attributes). UFS2 fully supports and runs existing UFS1 filesystems. New filesystems built using newfs can be built in either UFS1 or UFS2 format using the -O option. In this commit UFS1 is the default format, so if you want to build UFS2 format filesystems, you must specify -O 2. This default will be changed to UFS2 when UFS2 proves itself to be stable. In this commit the boot code for reading UFS2 filesystems is not compiled (see /sys/boot/common/ufsread.c) as there is insufficient space in the boot block. Once the size of the boot block is increased, this code can be defined.
Things to note: the definition of SBSIZE has changed to SBLOCKSIZE. The header file <ufs/ufs/dinode.h> must be included before <ufs/ffs/fs.h> so as to get the definitions of ufs2_daddr_t and ufs_lbn_t.
Still TODO: Verify that the first level bootstraps work for all the architectures. Convert the utility ffsinfo to understand UFS2 and test growfs. Add support for the extended attribute storage. Update soft updates to ensure integrity of extended attribute storage. Switch the current extended attribute interfaces to use the extended attribute storage. Add the extent like functionality (framework is there, but is currently never used).
Sponsored by: DARPA & NAI Labs. Reviewed by: Poul-Henning Kamp <phk@freebsd.org>
|
#
95974 |
|
03-May-2002 |
phk |
Name ufs_vop_[gs]etextattr() consistently with the rest of our VOPs and put then in the ufs_vnops where they belong, rather than in the ffs_vnops.
Ok'ed by: rwatson Sponsored by: DARPA & NAI Labs.
|
#
92728 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
92250 |
|
13-Mar-2002 |
mckusick |
This corrects the first of two known deadlock conditions that come from the presence of a snapshot file.
|
#
83366 |
|
12-Sep-2001 |
julian |
KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
|
#
76167 |
|
01-May-2001 |
phk |
Implement vop_std{get|put}pages() and add them to the default vop[].
Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.
|
#
76132 |
|
29-Apr-2001 |
phk |
VOP_BALLOC was never really a VOP in the first place, so convert it to UFS_BALLOC like the other "between UFS and FFS function interfaces".
|
#
76117 |
|
29-Apr-2001 |
grog |
Revert consequences of changes to mount.h, part 2.
Requested by: bde
|
#
75858 |
|
23-Apr-2001 |
grog |
Correct #includes to work with fixed sys/mount.h.
|
#
74433 |
|
19-Mar-2001 |
rwatson |
o Change options FFS_EXTATTR and options FFS_EXTATTR_AUTOSTART to options UFS_EXTATTR and UFS_EXTATTR_AUTOSTART respectively. This change reflects the fact that our EA support is implemented entirely at the UFS layer (modulo FFS start/stop/autostart hooks for mount and unmount events). This also better reflects the fact that [shortly] MFS will also support EAs, as well as possibly IFS.
o Consumers of the EA support in FFS are reminded that as a result, they must change kernel config files to reflect the new option names.
Obtained from: TrustedBSD Project
|
#
73942 |
|
07-Mar-2001 |
mckusick |
Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).
|
#
72012 |
|
04-Feb-2001 |
phk |
Another round of the <sys/queue.h> FOREACH transmogriffer.
Created with: sed(1) Reviewed by: md5(1)
|
#
67106 |
|
14-Oct-2000 |
adrian |
Initial commit of IFS - a inode-namespaced FFS. Here is a short description:
How it works: --
Basically ifs is a copy of ffs, overriding some vfs/vnops. (Yes, hack.) I didn't see the need in duplicating all of sys/ufs/ffs to get this off the ground.
File creation is done through a special file - 'newfile' . When newfile is called, the system allocates and returns an inode. Note that newfile is done in a cloning fashion:
fd = open("newfile", O_CREAT|O_RDWR, 0644); fstat(fd, &st);
printf("new file is %d\n", (int)st.st_ino);
Once you have created a file, you can open() and unlink() it by its returned inode number retrieved from the stat call, ie:
fd = open("5", O_RDWR);
The creation permissions depend entirely if you have write access to the root directory of the filesystem.
To get the list of currently allocated inodes, VOP_READDIR has been added which returns a directory listing of those currently allocated.
--
What this entails:
* patching conf/files and conf/options to include IFS as a new compile option (and since ifs depends upon FFS, include the FFS routines)
* An entry in i386/conf/NOTES indicating IFS exists and where to go for an explanation
* Unstaticize a couple of routines in src/sys/ufs/ffs/ which the IFS routines require (ffs_mount() and ffs_reload())
* a new bunch of routines in src/sys/ufs/ifs/ which implement the IFS routines. IFS replaces some of the vfsops, and a handful of vnops - most notably are VFS_VGET(), VOP_LOOKUP(), VOP_UNLINK() and VOP_READDIR(). Any other directory operation is marked as invalid.
What this results in:
* an IFS partition's create permissions are controlled by the perm/ownership of the root mount point, just like a normal directory
* Each inode has perm and ownership too
* IFS does *NOT* mean an FFS partition can be opened per inode. This is a completely seperate filesystem here
* Softupdates doesn't work with IFS, and really I don't think it needs it. Besides, fsck's are FAST. (Try it :-)
* Inodes 0 and 1 aren't allocatable because they are special (dump/swap IIRC). Inode 2 isn't allocatable since UFS/FFS locks all inodes in the system against this particular inode, and unravelling THAT code isn't trivial. Therefore, useful inodes start at 3.
Enjoy, and feedback is definitely appreciated!
|
#
66886 |
|
09-Oct-2000 |
eivind |
Blow away the v_specmountpoint define, replacing it with what it was defined as (rdev->si_mountpoint)
|
#
66187 |
|
21-Sep-2000 |
rwatson |
o Permit UFS Extended Attributes to be associated with special devices and FIFOs.
Obtained from: TrustedBSD Project
|
#
62976 |
|
11-Jul-2000 |
mckusick |
Add snapshots to the fast filesystem. Most of the changes support the gating of system calls that cause modifications to the underlying filesystem. The gating can be enabled by any filesystem that needs to consistently suspend operations by adding the vop_stdgetwritemount to their set of vnops. Once gating is enabled, the function vfs_write_suspend stops all new write operations to a filesystem, allows any filesystem modifying system calls already in progress to complete, then sync's the filesystem to disk and returns. The function vfs_write_resume allows the suspended write operations to begin again. Gating is not added by default for all filesystems as for SMP systems it adds two extra locks to such critical kernel paths as the write system call. Thus, gating should only be added as needed.
Details on the use and current status of snapshots in FFS can be found in /sys/ufs/ffs/README.snapshot so for brevity and timelyness is not included here. Unless and until you create a snapshot file, these changes should have no effect on your system (famous last words).
|
#
61730 |
|
16-Jun-2000 |
phk |
Revert part of my bioops change which implemented panic(8).
|
#
61724 |
|
16-Jun-2000 |
phk |
Virtualizes & untangles the bioops operations vector.
Ref: Message-ID: <18317.961014572@critter.freebsd.dk> To: current@
|
#
60041 |
|
05-May-2000 |
phk |
Separate the struct bio related stuff out of <sys/buf.h> into <sys/bio.h>.
<sys/bio.h> is now a prerequisite for <sys/buf.h> but it shall not be made a nested include according to bdes teachings on the subject of nested includes.
Diskdrivers and similar stuff below specfs::strategy() should no longer need to include <sys/buf.> unless they need caching of data.
Still a few bogus uses of struct buf to track down.
Repocopy by: peter
|
#
59241 |
|
15-Apr-2000 |
rwatson |
Introduce extended attribute support for FFS, allowing arbitrary (name, value) pairs to be associated with inodes. This support is used for ACLs, MAC labels, and Capabilities in the TrustedBSD security extensions, which are currently under development.
In this implementation, attributes are backed to data vnodes in the style of the quota support in FFS. Support for FFS extended attributes may be enabled using the FFS_EXTATTR kernel option (disabled by default). Userland utilities and man pages will be committed in the next batch. VFS interfaces and man pages have been in the repo since 4.0-RELEASE and are unchanged.
o ufs/ufs/extattr.h: UFS-specific extattr defines o ufs/ufs/ufs_extattr.c: bulk of support routines o ufs/{ufs,ffs,mfs}/*.[ch]: hooks and extattr.h includes o contrib/softupdates/ffs_softdep.c: extattr.h includes o conf/options, conf/files, i386/conf/LINT: added FFS_EXTATTR
o coda/coda_vfsops.c: XXX required extattr.h due to ufsmount.h (This should not be the case, and will be fixed in a future commit)
Currently attributes are not supported in MFS. This will be fixed.
Reviewed by: adrian, bp, freebsd-fs, other unthanked souls Obtained from: TrustedBSD Project
|
#
55756 |
|
10-Jan-2000 |
phk |
Give vn_isdisk() a second argument where it can return a suitable errno.
Suggested by: bde
|
#
55697 |
|
09-Jan-2000 |
mckusick |
Several performance improvements for soft updates have been added: 1) Fastpath deletions. When a file is being deleted, check to see if it was so recently created that its inode has not yet been written to disk. If so, the delete can proceed to immediately free the inode. 2) Background writes: No file or block allocations can be done while the bitmap is being written to disk. To avoid these stalls, the bitmap is copied to another buffer which is written thus leaving the original available for futher allocations. 3) Link count tracking. Constantly track the difference in i_effnlink and i_nlink so that inodes that have had no change other than i_effnlink need not be written. 4) Identify buffers with rollback dependencies so that the buffer flushing daemon can choose to skip over them.
|
#
53577 |
|
22-Nov-1999 |
phk |
Convert various pieces of code to use vn_isdisk() rather than checking for vp->v_type == VBLK.
In ccd: we don't need to call VOP_GETATTR to find the type of a vnode.
Reviewed by: sos
|
#
52635 |
|
29-Oct-1999 |
phk |
useracc() the prequel:
Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs.
This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ|WRITE} rather than B_{READ|WRITE} as argument.
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49535 |
|
08-Aug-1999 |
phk |
Decommision miscfs/specfs/specdev.h. Most of it goes into <sys/conf.h>, a few lines into <sys/vnode.h>.
Add a few fields to struct specinfo, paving the way for the fun part.
|
#
48225 |
|
26-Jun-1999 |
mckusick |
Convert buffer locking from using the B_BUSY and B_WANTED flags to using lockmgr locks. This commit should be functionally equivalent to the old semantics. That is, all buffer locking is done with LK_EXCLUSIVE requests. Changes to take advantage of LK_SHARED and LK_RECURSIVE will be done in future commits.
|
#
47995 |
|
18-Jun-1999 |
mckusick |
On our final pass through ffs_fsync, do all I/O synchronously so that we can find out if our flush is failing because of write errors. This change avoids a "flush failed" panic during unrecoverable disk errors.
|
#
47131 |
|
13-May-1999 |
mckusick |
Add a hook to ffs_fsync to allow soft updates to get first chance at doing a sync on the block device for the filesystem. That allows it to push the bitmap blocks before the inode blocks which greatly reduces the number of inode rollbacks that need to be done.
|
#
44391 |
|
02-Mar-1999 |
mckusick |
When fsync'ing a file on a filesystem using soft updates, we first try to write all the dirty blocks. If some of those blocks have dependencies, they will be remarked dirty when the I/O completes. On systems with really fast I/O systems, it is possible to get in an infinite loop trying to flush the buffers, because the I/O finishes before we can get all the dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The previous algorithm looped over the v_dirtyblkhd list writing out buffers until the list emptied.) So, now we mark each buffer that we try to write so that we can distinguish the ones that are being remarked dirty from those that we have not yet tried to flush. Once we have tried to push every buffer once, we then push any associated metadata that is causing the remaining buffers to be redirtied.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
|
#
42374 |
|
07-Jan-1999 |
bde |
Don't pass unused unused timestamp args to UFS_UPDATE() or waste time initializing them. This almost finishes centralizing (in-core) timestamp updates in ufs_itimes().
|
#
40790 |
|
31-Oct-1998 |
peter |
Use TAILQ macros for clean/dirty block list processing. Set b_xflags rather than abusing the list next pointer with a magic number.
|
#
39623 |
|
24-Sep-1998 |
luoqi |
Eliminate a race in VOP_FSYNC() when softupdates is enabled. Submitted by: Kirk McKusick <mckusick@McKusick.COM> Two minor changes are also included, 1. Remove gratuitious checks for error return from vn_lock with LK_RETRY set, vn_lock should always succeed in these cases. 2. Back out change rev. 1.36->1.37, which unnecessarily makes async mount a little more unstable. It also keeps us in sync with other BSDs. Suggested by: Bruce Evans <bde@zeta.org.au>
|
#
38907 |
|
07-Sep-1998 |
bde |
Put the zombie ffs sysctl node in "notyet" state together with its few remaining children. Prepare it for MOUNT_UFS going away.
|
#
36863 |
|
10-Jun-1998 |
julian |
Back out John's changes 1.45 -> 1.46 Kirk confirms that the original semantic was what he wanted... (well, a very slight difference) May fix "dangling deps" panic with soft updates.
|
#
35955 |
|
11-May-1998 |
julian |
Add missing splx()
Submitted by: Luoqi Chen <luoqi@chen.ml.org>
|
#
34961 |
|
30-Mar-1998 |
phk |
Eradicate the variable "time" from the kernel, using various measures. "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part.
Most uses of time.tv_sec now uses the new variable time_second instead.
gettime() changed to getmicrotime(0.
Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it).
A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random.
Add a new nfs_curusec() function.
Mark a couple of bogosities involving the now disappeard time variable.
Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args.
Change profiling in ncr.c to use ticks instead of time. Resolution is the same.
Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences.
Reviewed by: bde
|
#
34924 |
|
28-Mar-1998 |
bde |
Moved some #includes from <sys/param.h> nearer to where they are actually used.
|
#
34734 |
|
21-Mar-1998 |
dyson |
Softdep_sync_metadata appears to expect that it is called at splbio, so make it so...
|
#
34696 |
|
19-Mar-1998 |
dyson |
Fix vfs_bio_awrite usage, and correct vtruncbuf usage.
|
#
34266 |
|
08-Mar-1998 |
julian |
Reviewed by: dyson@freebsd.org (john Dyson), dg@root.com (david greenman) Submitted by: Kirk McKusick (mcKusick@mckusick.com) Obtained from: WHistle development tree
|
#
33847 |
|
26-Feb-1998 |
msmith |
In the author's words:
These diffs implement the first stage of a VOP_{GET|PUT}PAGES pushdown for local media FS's.
See ffs_putpages in /sys/ufs/ufs/ufs_readwrite.c for implementation details for generic *_{get|put}pages for local media FS's. Support is trivial to add for any FS that formerly relied on the default behaviour of the vnode_pager in in EOPNOTSUPP cases (just copy the ffs_getpages() code for the FS in question's *_{get|put}pages).
Obviously, it would be better if each local media FS implemented a more optimal method, instead of calling an exported interface from the /sys/vm/vnode_pager.c, but this is a necessary first step in getting the FS's to a point where they can be supplied with better implementations on a case-by-case basis.
Obviously, the cd9660_putpages() can be rather trivial (since it is a read-only FS type 8-)).
A slight (temporary) modification is made to print a diagnostic message in the case where the underlying filesystem attempts to engage in the previous behaviour. Failure is likely to be ungraceful.
Submitted by: terry@freebsd.org (Terry Lambert)
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
32976 |
|
01-Feb-1998 |
dyson |
Back out recent laptop sync changes. They had significant errors.
|
#
32951 |
|
31-Jan-1998 |
dyson |
Support more intelligent sync operations for MNT_NOATIME. PR: kern/5577 Submitted by: Craig Leres <leres@ee.lbl.gov>
|
#
32286 |
|
06-Jan-1998 |
dyson |
Make our v_usecount vnode reference count work identically to the original BSD code. The association between the vnode and the vm_object no longer includes reference counts. The major difference is that vm_object's are no longer freed gratuitiously from the vnode, and so once an object is created for the vnode, it will last as long as the vnode does.
When a vnode object reference count is incremented, then the underlying vnode reference count is incremented also. The two "objects" are now more intimately related, and so the interactions are now much less complex.
When vnodes are now normally placed onto the free queue with an object still attached. The rundown of the object happens at vnode rundown time, and happens with exactly the same filesystem semantics of the original VFS code. There is absolutely no need for vnode_pager_uncache and other travesties like that anymore.
A side-effect of these changes is that SMP locking should be much simpler, the I/O copyin/copyout optimizations work, NFS should be more ponderable, and further work on layered filesystems should be less frustrating, because of the totally coherent management of the vnode objects and vnodes.
Please be careful with your system while running this code, but I would greatly appreciate feedback as soon a reasonably possible.
|
#
30780 |
|
27-Oct-1997 |
bde |
Removed unused #includes. The need for most of them went away with recent changes (docluster* and vfs improvements).
|
#
30492 |
|
16-Oct-1997 |
phk |
Another VFS cleanup "kilo commit"
1. Remove VOP_UPDATE, it is (also) an UFS/{FFS,LFS,EXT2FS,MFS} intereface function, and now lives in the ufsmount structure.
2. Remove VOP_SEEK, it was unused.
3. Add mode default vops:
VOP_ADVLOCK vop_einval VOP_CLOSE vop_null VOP_FSYNC vop_null VOP_IOCTL vop_enotty VOP_MMAP vop_einval VOP_OPEN vop_null VOP_PATHCONF vop_einval VOP_READLINK vop_einval VOP_REALLOCBLKS vop_eopnotsupp
And remove identical functionality from filesystems
4. Add vop_stdpathconf, which returns the canonical stuff. Use it in the filesystems. (XXX: It's probably wrong that specfs and fifofs sets this vop, shouldn't it come from the "host" filesystem, for instance ufs or cd9660 ?)
5. Try to make system wide VOP functions have vop_* names.
6. Initialize the um_* vectors in LFS.
(Recompile your LKMS!!!)
|
#
30474 |
|
16-Oct-1997 |
phk |
VFS mega cleanup commit (x/N)
1. Add new file "sys/kern/vfs_default.c" where default actions for VOPs go. Implement proper defaults for ABORTOP, BWRITE, LEASE, POLL, REVOKE and STRATEGY. Various stuff spread over the entire tree belongs here.
2. Change VOP_BLKATOFF to a normal function in cd9660.
3. Kill VOP_BLKATOFF, VOP_TRUNCATE, VOP_VFREE, VOP_VALLOC. These are private interface functions between UFS and the underlying storage manager layer (FFS/LFS/MFS/EXT2FS). The functions now live in struct ufsmount instead.
4. Remove a kludge of VOP_ functions in all filesystems, that did nothing but obscure the simplicity and break the expandability. If a filesystem doesn't implement VOP_FOO, it shouldn't have an entry for it in its vnops table. The system will try to DTRT if it is not implemented. There are still some cruft left, but the bulk of it is done.
5. Fix another VCALL in vfs_cache.c (thanks Bruce!)
|
#
30439 |
|
15-Oct-1997 |
phk |
vnops megacommit
1. Use the default function to access all the specfs operations. 2. Use the default function to access all the fifofs operations. 3. Use the default function to access all the ufs operations. 4. Fix VCALL usage in vfs_cache.c 5. Use VOCALL to access specfs functions in devfs_vnops.c 6. Staticize most of the spec and fifofs vnops functions. 7. Make UFS panic if it lacks bits of the underlying storage handling.
|
#
30434 |
|
15-Oct-1997 |
phk |
Hmm, realign the vnops into two columns.
|
#
30431 |
|
15-Oct-1997 |
phk |
Stylistic overhaul of vnops tables. 1. Remove comment stating the blatantly obvious. 2. Align in two columns. 3. Sort all but the default element alphabetically. 4. Remove XXX comments pointing out entries not needed.
|
#
30283 |
|
10-Oct-1997 |
phk |
Add type arg to ffs_mountfs and avoid examining v_tag to find out if MFS is getting a free ride.
Use generic ufs_reclaim().
|
#
29888 |
|
27-Sep-1997 |
kato |
Clustered read and write are switched at mount-option level.
1. Clustered I/O is switched by the MNT_NOCLUSTERR and MNT_NOCLUSTERW bits of the mnt_flag. The sysctl variables, vfs.foo.doclusterread and vfs.foo.doclusterwrite are deleted. Only mount option can control clustered I/O from userland. 2. When foofs_mount mounts block device, foofs_mount checks D_CLUSTERR and D_CLUSTERW bits of the d_flags member in the block device switch table. If D_NOCLUSTERR / D_NOCLUSTERW are set, MNT_NOCLUSTERR / MNT_NOCLUSTERW bits will be set. In this case, MNT_NOCLUSTERR and MNT_NOCLUSTERW cannot be cleared from userland. 3. Vnode driver disables both clustered read and write. 4. Union filesystem disables clutered write.
Reviewed by: bde
|
#
29362 |
|
14-Sep-1997 |
peter |
Convert select -> poll. Delete 'always succeed' select/poll handlers, replaced with generic call. Flag missing vnode op table entries.
|
#
29041 |
|
02-Sep-1997 |
bde |
Removed unused #includes.
|
#
28787 |
|
26-Aug-1997 |
phk |
Uncut&paste cache_lookup().
This unifies several times in theory indentical 50 lines of code.
The filesystems have a new method: vop_cachedlookup, which is the meat of the lookup, and use vfs_cache_lookup() for their vop_lookup method. vfs_cache_lookup() will check the namecache and pass on to the vop_cachedlookup method in case of a miss.
It's still the task of the individual filesystems to populate the namecache with cache_enter().
Filesystems that do not use the namecache will just provide the vop_lookup method as usual.
|
#
28701 |
|
25-Aug-1997 |
kato |
Renamed doclusterread/write to unique names (ffs_doclusterread/write), and staticize them. Move the #include of <sys/sysctl.h> to the top of the file.
Pointed out by: Bruce Evans <bde@zeta.org.au>
|
#
24101 |
|
22-Mar-1997 |
bde |
Fixed some invalid (non-atomic) accesses to `time', mostly ones of the form `tv = time'. Use a new function gettime(). The current version just forces atomicicity without fixing precision or efficiency bugs. Simplified some related valid accesses by using the central function.
|
#
23383 |
|
04-Mar-1997 |
bde |
Fixed connection of vfs.ffs node to the sysctl tree.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
12911 |
|
17-Dec-1995 |
phk |
Staticize.
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12288 |
|
14-Nov-1995 |
phk |
Get rid of the last debug sysctl variables of the old style.
|
#
12158 |
|
09-Nov-1995 |
bde |
Introduced a type `vop_t' for vnode operation functions and used it 1138 times (:-() in casts and a few more times in declarations. This change is null for the i386.
The type has to be `typedef int vop_t(void *)' and not `typedef int vop_t()' because `gcc -Wstrict-prototypes' warns about the latter. Since vnode op functions are called with args of different (struct pointer) types, neither of these function types is any use for type checking of the arg, so it would be preferable not to use the complete function type, especially since using the complete type requires adding 1138 casts to avoid compiler warnings and another 40+ casts to reverse the function pointer conversions before calling the functions.
|
#
11701 |
|
23-Oct-1995 |
dyson |
Finalize GETPAGES layering scheme. Move the device GETPAGES interface into specfs code. No need at this point to modify the PUTPAGES stuff except in the layered-type (NULL/UNION) filesystems.
|
#
10998 |
|
25-Sep-1995 |
dyson |
Re-enable read clustering.
|
#
10949 |
|
22-Sep-1995 |
dg |
Shit! I changed the wrong doclusterread! ...Thanks to Steven Wallace and Poul-Henning for convincing me that I should look at my mistake! :-)
|
#
10946 |
|
21-Sep-1995 |
dg |
Disable file read clustering until the bug(s) in vfs_cluster.c are fixed. This should temporarily fix the sig 10/11 problems that people have been having for the past 3 weeks.
|
#
10578 |
|
06-Sep-1995 |
dyson |
Added indirect pointer for ffs_getpages, and added external declaration.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
8692 |
|
21-May-1995 |
dg |
Changes to fix the following bugs:
1) Files weren't properly synced on filesystems other than UFS. In some cases, this lead to lost data. Most likely would be noticed on NFS. The fix is to make the VM page sync/object_clean general rather than in each filesystem. 2) Mixing regular and mmaped file I/O on NFS was very broken. It caused chunks of files to end up as zeroes rather than the intended contents. The fix was to fix several race conditions and to kludge up the "b_dirtyoff" and "b_dirtyend" that NFS relies upon - paying attention to page modifications that occurred via the mmapping.
Reviewed by: David Greenman Submitted by: John Dyson
|
#
7695 |
|
09-Apr-1995 |
dg |
Changes from John Dyson and myself:
Fixed remaining known bugs in the buffer IO and VM system.
vfs_bio.c: Fixed some race conditions and locking bugs. Improved performance by removing some (now) unnecessary code and fixing some broken logic. Fixed process accounting of # of FS outputs. Properly handle NFS interrupts (B_EINTR).
(various) Replaced calls to clrbuf() with calls to an optimized routine called vfs_bio_clrbuf().
(various FS sync) Sync out modified vnode_pager backed pages.
ffs_vnops.c: Do two passes: Sync out file data first, then indirect blocks.
vm_fault.c: Fixed deadly embrace caused by acquiring locks in the wrong order.
vnode_pager.c: Changed to use buffer I/O system for writing out modified pages. This should fix the problem with the modification date previous not getting updated. Also dramatically simplifies the code. Note that this is going to change in the future and be implemented via VOP_PUTPAGES().
vm_object.c: Fixed a pile of bugs related to cleaning (vnode) objects. The performance of vm_object_page_clean() is terrible when dealing with huge objects, but this will change when we implement a binary tree to keep the object pages sorted.
vm_pageout.c: Fixed broken clustering of pageouts. Fixed race conditions and other lockup style bugs in the scanning of pages. Improved performance.
|
#
5455 |
|
09-Jan-1995 |
dg |
These changes embody the support of the fully coherent merged VM buffer cache, much higher filesystem I/O performance, and much better paging performance. It represents the culmination of over 6 months of R&D.
The majority of the merged VM/cache work is by John Dyson.
The following highlights the most significant changes. Additionally, there are (mostly minor) changes to the various filesystem modules (nfs, msdosfs, etc) to support the new VM/buffer scheme.
vfs_bio.c: Significant rewrite of most of vfs_bio to support the merged VM buffer cache scheme. The scheme is almost fully compatible with the old filesystem interface. Significant improvement in the number of opportunities for write clustering.
vfs_cluster.c, vfs_subr.c Upgrade and performance enhancements in vfs layer code to support merged VM/buffer cache. Fixup of vfs_cluster to eliminate the bogus pagemove stuff.
vm_object.c: Yet more improvements in the collapse code. Elimination of some windows that can cause list corruption.
vm_pageout.c: Fixed it, it really works better now. Somehow in 2.0, some "enhancements" broke the code. This code has been reworked from the ground-up.
vm_fault.c, vm_page.c, pmap.c, vm_object.c Support for small-block filesystems with merged VM/buffer cache scheme.
pmap.c vm_map.c Dynamic kernel VM size, now we dont have to pre-allocate excessive numbers of kernel PTs.
vm_glue.c Much simpler and more effective swapping code. No more gratuitous swapping.
proc.h Fixed the problem that the p_lock flag was not being cleared on a fork.
swap_pager.c, vnode_pager.c Removal of old vfs_bio cruft to support the past pseudo-coherency. Now the code doesn't need it anymore.
machdep.c Changes to better support the parameter values for the merged VM/buffer cache scheme.
machdep.c, kern_exec.c, vm_glue.c Implemented a seperate submap for temporary exec string space and another one to contain process upages. This eliminates all map fragmentation problems that previously existed.
ffs_inode.c, ufs_inode.c, ufs_readwrite.c Changes for merged VM/buffer cache. Add "bypass" support for sneaking in on busy buffers.
Submitted by: John Dyson and David Greenman
|
#
3487 |
|
09-Oct-1994 |
phk |
Cosmetics. make gcc less noisy. Still some way to go here.
|
#
3396 |
|
06-Oct-1994 |
dg |
Use tsleep() rather than sleep so that 'ps' is more informative about the wait.
|
#
2979 |
|
22-Sep-1994 |
wollman |
More loadable VFS changes:
- Make a number of filesystems work again when they are statically compiled (blush)
- FIFOs are no longer optional; ``options FIFO'' removed from distributed config files.
|
#
2946 |
|
21-Sep-1994 |
wollman |
Implemented loadable VFS modules, and made most existing filesystems loadable. (NFS is a notable exception.)
|
#
1960 |
|
08-Aug-1994 |
dg |
Made lockf advisory locking code generic (rather than ufs specific), and use it in NFS. This is required both for diskless support and for POSIX compliance. Note: the support in NFS is only for the local node.
Submitted by: based on work originally done by Yuval Yurom
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1542 |
|
24-May-1994 |
rgrimes |
This commit was generated by cvs2svn to compensate for changes in r1541, which included commits to RCS files with non-trunk default branches.
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|