History log of /openbsd-current/sys/ufs/ufs/ufs_dirhash.c
Revision Date Author Comments
# 1.43 09-Jan-2024 guenther

Delete support for FFS filesystems before the in-inode symlink
optimization. As observed by ali_farzanrad(at)riseup.net, support
for these was broken in the 5.5 release in early 2014 by the time_t
changes. No one noticed before now, so clearly this isn't something
we need to continue to support; rejecting in ffs_validate() is an
improvement.

Also: simplify DIRSIZ(), drop OLDDIRFMT and NEWDIRFMT, tests of
fs_maxsymlinklen against zero, #ifdef tests of FS_44INODEFMT, and
remove support for newfs -O0, last used in 2016.

ok miod@


Revision tags: OPENBSD_6_5_BASE OPENBSD_6_6_BASE OPENBSD_6_7_BASE OPENBSD_6_8_BASE OPENBSD_6_9_BASE OPENBSD_7_0_BASE OPENBSD_7_1_BASE OPENBSD_7_2_BASE OPENBSD_7_3_BASE OPENBSD_7_4_BASE
# 1.42 15-Mar-2019 kevlo

Remove FBSDID.

ok deraadt@


# 1.41 06-Mar-2019 tedu

increase dirhash mem a bit since very tiny machines are less common.
perhaps not enough for everyone, but we'll see what happens.


Revision tags: OPENBSD_6_3_BASE OPENBSD_6_4_BASE
# 1.40 26-Oct-2017 guenther

Revert 2006-4-29Z23:09:45 commit that switched from rwlocks to mutexes.
Use of mutexes there is a WITNESS violation.

ok visa@ mpi@


Revision tags: OPENBSD_6_2_BASE
# 1.39 19-Apr-2017 dhill

Add sizes to free()

ok deraadt@ visa@


Revision tags: OPENBSD_6_1_BASE
# 1.38 15-Sep-2016 dlg

all pools have their ipl set via pool_setipl, so fold it into pool_init.

the ioff argument to pool_init() is unused and has been for many
years, so this replaces it with an ipl argument. because the ipl
will be set on init we no longer need pool_setipl.

most of these changes have been done with coccinelle using the spatch
below. cocci sucks at formatting code though, so i fixed that by hand.

the manpage and subr_pool.c bits i did myself.

ok tedu@ jmatthew@

@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);


Revision tags: OPENBSD_6_0_BASE
# 1.37 19-Jun-2016 dlg

add pool_setipl on all pools.

ok tedu@ visa@


# 1.36 03-Apr-2016 natano

Remove sparc special-casing from ufsdirhash_init(). This is not required
anymore since the kernel VM space increase work done in sparc about one
year ago.

from Miod Vallat; thanks!
ok tobiasu


# 1.35 23-Mar-2016 natano

remove vax handling
ok millert


# 1.34 27-Feb-2016 natano

Move mnt_maxsymlink from struct mount to struct ufsmount.

The concept of differentiating between "short" and "long" symlinks is
specific to ufs/, so it shouldn't creep into the generic fs layer.
Inspired by a similar commit to NetBSD.

While there replace all references to mnt_maxsymlinklen in ufs/ext2fs
with EXT2_MAXSYMLINKLEN, which is the constant max short symlink len for
ext2fs. This allows getting rid of some (mnt_maxsymlinklen == 0) checks
there, which are always false for ext2fs.

input and ok stefan@
ok millert@


Revision tags: OPENBSD_5_8_BASE OPENBSD_5_9_BASE
# 1.33 14-Mar-2015 jsg

Remove some includes that include-what-you-use claims don't
have any direct symbols used. Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@


Revision tags: OPENBSD_5_7_BASE
# 1.32 23-Dec-2014 tedu

change pool_init allocator to NULL and pass PR_WAITOK in flags as a sign
that these don't need to support interrupts


# 1.31 04-Dec-2014 tedu

use siphash for dirhash. ok deraadt dlg


# 1.30 14-Sep-2014 jsg

remove unneeded proc.h includes
ok mpi@ kspillner@


Revision tags: OPENBSD_5_6_BASE
# 1.29 14-Jul-2014 beck

revert free checks in here. this seems to be a bit too aggressive at the
moment and now is not the time. hitting these in here causes chaos.
We need to do these, but at a better time than right after a hackathon
and before release.
ok guenther@


# 1.28 13-Jul-2014 tedu

pass correct sizes to free()


# 1.27 13-Jul-2014 tedu

use mallocarray
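
For context, mallocarray(9) allocates space for an array of nmemb objects
and checks the nmemb * size multiplication for overflow. A minimal
userland sketch of that check follows; checked_array_alloc is a
hypothetical helper written for illustration, not the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userland helper (not the kernel's mallocarray(9)):
 * allocate nmemb objects of the given size, refusing the request
 * when nmemb * size would wrap around size_t.
 */
void *
checked_array_alloc(size_t nmemb, size_t size)
{
	/* nmemb * size overflows exactly when nmemb > SIZE_MAX / size. */
	if (size != 0 && nmemb > SIZE_MAX / size) {
		errno = ENOMEM;
		return NULL;
	}
	return malloc(nmemb * size);
}
```

Without the check, a wrapped product would yield a buffer far smaller
than the caller expects.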


# 1.26 12-Jul-2014 tedu

add a size argument to free. will be used soon, but for now default to 0.
after discussions with beck deraadt kettenis.


Revision tags: OPENBSD_5_5_BASE
# 1.25 12-Dec-2013 tedu

bcmp -> memcmp


Revision tags: OPENBSD_5_3_BASE OPENBSD_5_4_BASE
# 1.24 16-Aug-2012 tedu

remove pool hiwat call. hiwat is less useful than it used to be.
less greedy pools are nicer pools.


Revision tags: OPENBSD_5_0_BASE OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.23 28-Jun-2011 tedu

change two function defs with () to (void)


Revision tags: OPENBSD_4_8_BASE OPENBSD_4_9_BASE
# 1.22 25-Apr-2010 tedu

dirhash can cope with real locks (and has before), enable mutexes here.
ok deraadt


Revision tags: OPENBSD_4_7_BASE
# 1.21 20-Aug-2009 jasper

- reference correct variable in comment

ok tedu@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE OPENBSD_4_6_BASE
# 1.20 12-Jun-2008 deraadt

Bring biomem diff back into the tree after the nfs_bio.c fix went in.
ok thib beck art


# 1.19 11-Jun-2008 deraadt

back out biomem diff since it is not right yet. Doing very large
file copies to nfsv2 causes the system to eventually peg the console.
On the console ^T indicates that the load is increasing rapidly, ddb
indicates many calls to getbuf, there is some very slow nfs traffic
making none (or extremely slow) progress. Eventually some machines
seize up entirely.


# 1.18 10-Jun-2008 beck

Buffer cache revamp

1) remove multiple size queues, introduced as a stopgap.
2) decouple pages containing data from their mappings
3) only keep buffers mapped when they actually have to be mapped
(right now, this is when buffers are B_BUSY)
4) New functions to make a buffer busy, and release the busy flag
(buf_acquire and buf_release)
5) Move high/low water marks and statistics counters into a structure
6) Add a sysctl to retrieve buffer cache statistics

Tested in several variants and beat upon by bob and art for a year. run
accidentally on henning's nfs server for a few months...

ok deraadt@, krw@, art@ - who promises to be around to deal with any fallout


Revision tags: OPENBSD_4_3_BASE
# 1.17 08-Oct-2007 krw

Initialize dh_blkfree with zeros.

ok beck@


# 1.16 05-Oct-2007 thib

MALLOC/FREE -> malloc/free + M_ZERO.

As a side effect, this probably fixes PR5596. If the allocation
of dh_hash succeeds and that of dh_blkfree fails, we jump into the
fail case without having initialized dh_hash properly (that is,
filled the array with memory from the dirhash pool). The != NULL
check still holds, since the memory hasn't been zeroed, so we
start pool_put()'ing garbage, causing the crash in PR5596.

PR5596 debugging by pedro.

ok art@, krw@
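
The failure mode described in this entry can be sketched in userland C.
This is an illustration only: struct table, table_build, table_free and
the fail_at parameter are hypothetical names, with calloc standing in
for malloc with M_ZERO and free standing in for pool_put(). It assumes
the usual platforms where all-bits-zero is a null pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table of separately allocated blocks. */
struct table {
	void **blocks;
	size_t nblocks;
};

/*
 * Release a table. Safe on a partially built table because the
 * pointer array was zeroed, so unfilled slots are NULL.
 */
void
table_free(struct table *t)
{
	if (t->blocks != NULL) {
		for (size_t i = 0; i < t->nblocks; i++)
			free(t->blocks[i]);	/* free(NULL) is a no-op */
		free(t->blocks);
	}
	t->blocks = NULL;
	t->nblocks = 0;
}

/*
 * Build a table of nblocks blocks. fail_at simulates an allocation
 * failure at that block index. Returns 0 on success, -1 on failure.
 */
int
table_build(struct table *t, size_t nblocks, size_t fail_at)
{
	t->nblocks = nblocks;
	/* Zeroed allocation: every slot starts out as NULL. */
	t->blocks = calloc(nblocks, sizeof(*t->blocks));
	if (t->blocks == NULL)
		goto fail;
	for (size_t i = 0; i < nblocks; i++) {
		if (i == fail_at)
			goto fail;
		t->blocks[i] = malloc(32);
		if (t->blocks[i] == NULL)
			goto fail;
	}
	return 0;
fail:
	table_free(t);
	return -1;
}
```

Had the pointer array come from a non-zeroing allocator, the cleanup
loop would hand uninitialized pointers to free().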


Revision tags: OPENBSD_4_2_BASE
# 1.15 23-Jul-2007 kettenis

Since __sparc__ gets defined on sparc64 too, add a !defined (__sparc64__)
to the condition that protects CPU_ISSUN4OR4C. While we currently define
that macro on sparc64 too, we won't in the near future.

ok miod@


Revision tags: OPENBSD_4_0_BASE OPENBSD_4_1_BASE
# 1.14 21-Jun-2006 mickey

do not wait in pool_get() here as we can recover from no memory; tedu@ pedro@ ok; tested by many


# 1.13 30-May-2006 mickey

do not deref a ptr before NULL check; pedro@ ok


# 1.12 29-Apr-2006 tedu

no need for using rwlocks in dirhash. i was confused about the purpose
freebsd's mutexes served here, but they are only for smp protection.
the code is careful not to block and needs no rwlocks.
ok pedro and an assortment of testers


Revision tags: OPENBSD_3_9_BASE
# 1.11 28-Dec-2005 pedro

Use the DIP macros to uniformly access fields from UFS1 and UFS2 dinodes.
No functional change, okay tedu@.


# 1.10 13-Oct-2005 mickey

pump up the high water mark on the dirhash pool to avoid page allocation throttling; pedro@ ok


Revision tags: OPENBSD_3_8_BASE
# 1.9 03-Jul-2005 drahn

Extended Attributes was a piece to get to ACLs, however ACLs have not
been worked on, so EA is pointless. Also the code is not enabled
in GENERIC so it is not being tested or maintained.


Revision tags: OPENBSD_3_6_BASE OPENBSD_3_7_BASE
# 1.8 21-Jul-2004 art

I was wrong. The asymmetry created by the proc argument to rw_enter_write
is horrible and doesn't add anything.

Remove it.
XXX - the fdplock macro will need a separate cleanup.

niklas@ markus@ ok


Revision tags: OPENBSD_3_5_BASE SMP_SYNC_A SMP_SYNC_B
# 1.7 16-Mar-2004 tedu

re-add fbsd id so i can track this thing


# 1.6 16-Feb-2004 tedu

branches: 1.6.2;
sync MACRO names with freebsd.


# 1.5 02-Feb-2004 tedu

gluk points out i missed tags


# 1.4 07-Jan-2004 tedu

sysctls for dirhash variables. with a hint from miod. ok deraadt


# 1.3 07-Jan-2004 tedu

remove FreeBSD ifdef


# 1.2 28-Dec-2003 mickey

do not use MALLOC on variable sized allocations


# 1.1 28-Dec-2003 tedu

add ian dowse's dirhash code from freebsd.
by building a hash table for large directories, lookups and deletions
become about constant time. this is an excellent improvement for dirs with
10k or more files.
some more cleanup to come, but the code works.
enabled with option UFS_DIRHASH
testing brad millert otto
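
To illustrate the idea in this entry, here is a minimal sketch of a
name-to-offset hash index with linear probing. It is not the dirhash
implementation: struct dir_index, the fixed table size, the FNV-1a hash
and all function names are assumptions chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLOTS 64	/* hypothetical table size, must be a power of two */

/* Maps an entry name to its byte offset within the directory. */
struct dir_index {
	const char *names[SLOTS];	/* NULL marks an unused slot */
	long offsets[SLOTS];
};

/* FNV-1a string hash, used here only because it is short. */
uint32_t
name_hash(const char *name)
{
	uint32_t h = 2166136261u;

	while (*name != '\0') {
		h ^= (unsigned char)*name++;
		h *= 16777619u;
	}
	return h;
}

void
index_init(struct dir_index *di)
{
	for (size_t i = 0; i < SLOTS; i++) {
		di->names[i] = NULL;
		di->offsets[i] = 0;
	}
}

/* Insert a name; returns 0 on success, -1 if the table is full. */
int
index_add(struct dir_index *di, const char *name, long offset)
{
	uint32_t slot = name_hash(name) & (SLOTS - 1);

	for (int i = 0; i < SLOTS; i++) {
		if (di->names[slot] == NULL) {
			di->names[slot] = name;
			di->offsets[slot] = offset;
			return 0;
		}
		slot = (slot + 1) & (SLOTS - 1);	/* linear probing */
	}
	return -1;
}

/* Return the stored offset for name, or -1 if it is not present. */
long
index_lookup(const struct dir_index *di, const char *name)
{
	uint32_t slot = name_hash(name) & (SLOTS - 1);

	for (int i = 0; i < SLOTS; i++) {
		if (di->names[slot] == NULL)
			return -1;
		if (strcmp(di->names[slot], name) == 0)
			return di->offsets[slot];
		slot = (slot + 1) & (SLOTS - 1);
	}
	return -1;
}
```

A lookup probes a few slots near the hashed position instead of
scanning every entry, which is where the roughly constant-time
behaviour for large directories comes from.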

