History log of /freebsd-10-stable/sys/modules/nfsserver/Makefile
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 249596 17-Apr-2013 ken

Move the NFS FHA (File Handle Affinity) code from sys/nfsserver to
sys/nfs, since it is now shared by the two NFS servers.

Suggested by: rmacklem
Sponsored by: Spectra Logic
MFC after: 2 weeks


# 249592 17-Apr-2013 ken

Revamp the old NFS server's File Handle Affinity (FHA) code so that
it will work with either the old or new server.

The FHA code keeps a cache of currently active file handles for
NFSv2 and v3 requests, so that read and write requests for the same
file are directed to the same group of threads (reads) or thread
(writes). It does not currently work for NFSv4 requests. They are
more complex, and will take more work to support.

This improves read-ahead performance, especially with ZFS, if the
FHA tuning parameters are configured appropriately. Without the
FHA code, concurrent reads that are part of a sequential read from
a file will be directed to separate NFS threads. This has the
effect of confusing the ZFS zfetch (prefetch) code and makes
sequential reads significantly slower with clients like Linux that
do a lot of prefetching.

The FHA code has also been updated to direct write requests to nearby
file offsets to the same thread in the same way it batches reads,
and the FHA code will now also send writes to multiple threads when
needed.

This improves sequential write performance in ZFS, because writes
to a file are now more ordered. Since NFS writes (generally
less than 64K) are smaller than the typical ZFS record size
(usually 128K), out of order NFS writes to the same block can
trigger a read in ZFS. Sending them down the same thread increases
the odds of their being in order.

In order for multiple write threads per file in the FHA code to be
useful, writes in the NFS server have been changed to use a LK_SHARED
vnode lock, and upgrade that to LK_EXCLUSIVE if the filesystem
doesn't allow multiple writers to a file at once. ZFS is currently
the only filesystem that allows multiple writers to a file, because
it has internal file range locking. This change does not affect the
NFSv4 code.

This improves random write performance to a single file in ZFS, since
we can now have multiple writers inside ZFS at one time.

I have changed the default tuning parameters to a 22 bit (4MB)
window size (from 256K) and unlimited commands per thread as a
result of my benchmarking with ZFS.

The FHA code has been updated to allow configuring the tuning
parameters from loader tunable variables in addition to sysctl
variables. The read offset window calculation has been slightly
modified as well. Instead of having separate bins, each file
handle has a rolling window of bin_shift size. This minimizes
glitches in throughput when shifting from one bin to another.

sys/conf/files:
Add nfs_fha_new.c and nfs_fha_old.c. Compile nfs_fha.c
when either the old or the new NFS server is built.

sys/fs/nfs/nfsport.h,
sys/fs/nfs/nfs_commonport.c:
Bring in changes from Rick Macklem to newnfs_realign that
allow it to operate in blocking (M_WAITOK) or non-blocking
(M_NOWAIT) mode.

sys/fs/nfs/nfs_commonsubs.c,
sys/fs/nfs/nfs_var.h:
Bring in a change from Rick Macklem to allow telling
nfsm_dissect() whether or not to wait for mallocs.

sys/fs/nfs/nfsm_subs.h:
Bring in changes from Rick Macklem to create a new
nfsm_dissect_nonblock() inline function and
NFSM_DISSECT_NONBLOCK() macro.

sys/fs/nfs/nfs_commonkrpc.c,
sys/fs/nfsclient/nfs_clkrpc.c:
Add the malloc wait flag to a newnfs_realign() call.

sys/fs/nfsserver/nfs_nfsdkrpc.c:
Setup the new NFS server's RPC thread pool so that it will
call the FHA code.

Add the malloc flag argument to newnfs_realign().

Unstaticize newnfs_nfsv3_procid[] so that we can use it in
the FHA code.

sys/fs/nfsserver/nfs_nfsdsocket.c:
In nfsrvd_dorpc(), add NFSPROC_WRITE to the list of RPC types
that use the LK_SHARED lock type.

sys/fs/nfsserver/nfs_nfsdport.c:
In nfsd_fhtovp(), if we're starting a write, check to see
whether the underlying filesystem supports shared writes.
If not, upgrade the lock type from LK_SHARED to LK_EXCLUSIVE.

sys/nfsserver/nfs_fha.c:
Remove all code that is specific to the NFS server
implementation. Anything that is server-specific is now
accessed through a callback supplied by that server's FHA
shim in the new softc.

There are now separate sysctls and tunables for the FHA
implementations for the old and new NFS servers. The new
NFS server has its tunables under vfs.nfsd.fha, the old
NFS server's tunables are under vfs.nfsrv.fha as before.

In fha_extract_info(), use callouts for all server-specific
code. Getting file handles and offsets is now done in the
individual server's shim module.

In fha_hash_entry_choose_thread(), change the way we decide
whether two reads are in proximity to each other.
Previously, the calculation was a simple shift operation to
see whether the offsets were in the same power of 2 bucket.
The issue was that there would be a bucket (and therefore
thread) transition, even if the reads were in close
proximity. When there is a thread transition, reads wind
up going somewhat out of order, and ZFS gets confused.

The new calculation simply tries to see whether the offsets
are within 1 << bin_shift of each other. If they are, the
reads will be sent to the same thread.

The effect of this change is that for sequential reads, if
the client doesn't exceed the max_reqs_per_nfsd parameter
and the bin_shift is set to a reasonable value (22, or
4MB works well in my tests), the reads in any sequential
stream will largely be confined to a single thread.

Change fha_assign() so that it takes a softc argument. It
is now called from the individual server's shim code, which
will pass in the softc.

Change fhe_stats_sysctl() so that it takes a softc
parameter. It is now called from the individual server's
shim code. Add the current offset to the list of things
printed out about each active thread.

Change the num_reads and num_writes counters in the
fha_hash_entry structure to 32-bit values, and rename them
num_rw and num_exclusive, respectively, to reflect their
changed usage.

Add an enable sysctl and tunable that allows the user to
disable the FHA code (when vfs.XXX.fha.enable = 0). This
is useful for before/after performance comparisons.

nfs_fha.h:
Move most structure definitions out of nfs_fha.c and into
the header file, so that the individual server shims can
see them.

Change the default bin_shift to 22 (4MB) instead of 18
(256K). Allow unlimited commands per thread.

sys/nfsserver/nfs_fha_old.c,
sys/nfsserver/nfs_fha_old.h,
sys/fs/nfsserver/nfs_fha_new.c,
sys/fs/nfsserver/nfs_fha_new.h:
Add shims for the old and new NFS servers to interface with
the FHA code, and callbacks for the

The shims contain all of the code and definitions that are
specific to the NFS servers.

They setup the server-specific callbacks and set the server
name for the sysctl and loader tunable variables.

sys/nfsserver/nfs_srvkrpc.c:
Configure the RPC code to call fhaold_assign() instead of
fha_assign().

sys/modules/nfsd/Makefile:
Add nfs_fha.c and nfs_fha_new.c.

sys/modules/nfsserver/Makefile:
Add nfs_fha_old.c.

Reviewed by: rmacklem
Sponsored by: Spectra Logic
MFC after: 2 weeks


# 203968 16-Feb-2010 marius

Factor out the code shared between NFS client and server into its own
module. With r203732 it became apparent that creating the sysctl nodes
twice causes at least a warning, however the whole code shouldn't be
present twice in the first place.

Discussed with: rmacklem


# 195202 30-Jun-2009 dfr

Remove the old kernel RPC implementation and the NFS_LEGACYRPC option.

Approved by: re


# 193588 06-Jun-2009 rwatson

Remove opt_mac.h generation for various kernel modules that no longer
require it.

Submitted by: pjd


# 185299 25-Nov-2008 dfr

Fix standalone module build by generating opt_kgssapi.h.

Submitted by: n_hibma


# 184716 06-Nov-2008 des

Unbreak NFS.

Pointy hat to: dfr


# 184588 03-Nov-2008 dfr

Implement support for RPCSEC_GSS authentication to both the NFS client
and server. This replaces the RPC implementation of the NFS client and
server with the newer RPC implementation originally developed
(actually ported from the userland sunrpc code) to support the NFS
Lock Manager. I have tested this code extensively and I believe it is
stable and that performance is at least equal to the legacy RPC
implementation.

The NFS code currently contains support for both the new RPC
implementation and the older legacy implementation inherited from the
original NFS codebase. The default is to use the new implementation -
add the NFS_LEGACYRPC option to fall back to the old code. When I
merge this support back to RELENG_7, I will probably change this so
that users have to 'opt in' to get the new code.

To use RPCSEC_GSS on either client or server, you must build a kernel
which includes the KGSSAPI option and the crypto device. On the
userland side, you must build at least a new libc, mountd, mount_nfs
and gssd. You must install new versions of /etc/rc.d/gssd and
/etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf.

As long as gssd is running, you should be able to mount an NFS
filesystem from a server that requires RPCSEC_GSS authentication. The
mount itself can happen without any kerberos credentials but all
access to the filesystem will be denied unless the accessing user has
a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There
is currently no support for situations where the ticket file is in a
different place, such as when the user logged in via SSH and has
delegated credentials from that login. This restriction is also
present in Solaris and Linux. In theory, we could improve this in
future, possibly using Brooks Davis' implementation of variant
symlinks.

Supporting RPCSEC_GSS on a server is nearly as simple. You must create
service creds for the server in the form 'nfs/<fqdn>@<REALM>' and
install them in /etc/krb5.keytab. The standard heimdal utility ktutil
makes this fairly easy. After the service creds have been created, you
can add a '-sec=krb5' option to /etc/exports and restart both mountd
and nfsd.

The only other difference an administrator should notice is that nfsd
doesn't fork to create service threads any more. In normal operation,
there will be two nfsd processes, one in userland waiting for TCP
connections and one in the kernel handling requests. The latter
process will create as many kthreads as required - these should be
visible via 'top -H'. The code has some support for varying the number
of service threads according to load but initially at least, nfsd uses
a fixed number of threads according to the value supplied to its '-n'
option.

Sponsored by: Isilon Systems
MFC after: 1 month


# 151350 14-Oct-2005 yar

Let modules use the kernel's opt_*.h files if built along with
the kernel by wrapping all targets for fake opt_*.h files in
.if defined(KERNBUILDDIR). Thus, such fake files won't be
created at all if modules are built with the kernel.

Some modules undergo cleanup like removing unused or unneeded
options or .h files, without which they wouldn't build this way
or the other.

Reviewed by: ru
Tested by: no binary changes in modules built alone
Tested on: i386 sparc64 amd64


# 106412 04-Nov-2002 rwatson

Permit MAC policies to instrument the access control decisions for
system accounting configuration and for nfsd server thread attach.
Policies might use this to protect the integrity or confidentiality
of accounting data, limit the ability to turn on or off accounting,
as well as to prevent inappropriately labeled threads from becoming nfs
server threads.

Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories


# 100134 15-Jul-2002 alfred

Add IPv6 support.

Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>


# 89260 11-Jan-2002 ru

Drop <bsd.man.mk> support from <bsd.kmod.mk>.

Not objected to by: -current


# 83651 18-Sep-2001 peter

Cleanup and split of nfs client and server code.
This builds on the top of several repo-copies.


# 75651 18-Apr-2001 alfred

NFS module now requires nfs_lock.c


# 71985 04-Feb-2001 peter

Zap some bad examples:
opt_foo.h:
touch opt_foo.h
.. is unnecessary - kmod.mk does this for us.


# 70711 06-Jan-2001 obrien

Use a consistent style and one much closer to the rest of /usr/src


# 60966 26-May-2000 peter

Use .include <bsd.kmod.mk> to get to ../../*/conf/kmod.mk instead of
encoding the relative path.


# 59951 04-May-2000 peter

Pull in sys/conf/kmod.mk, rather than /usr/share/mk/bsd.kmod.mk.
This means that the kernel can be totally self contained now and is not
dependent on the last buildworld to update /usr/share/mk. This might
also make it easier to build 5.x kernels on 4.0 boxes etc, assuming
gensetdefs and config(8) are updated.


# 54508 12-Dec-1999 peter

Remove a whole bunch of "CFLAGS+= -DFSNAME" cruft. It hasn't been
needed for ages, but keeps getting cut/pasted into new Makefiles.
(Once apon a time it was used to activate mount arguments in
<sys/mount.h>, but that was killed with extreme prejudice long ago)


# 54502 12-Dec-1999 peter

Bring these more into line with other modules that have .h files generated
on the fly.


# 53846 28-Nov-1999 bde

Removed special rules for building and cleaning device interface files
and empty options files. The rules are now generated automatically in
bsd.kmod.mk. Cleaned up related things ($S and ${CLEANFILES}).


# 52787 02-Nov-1999 green

Unbreak this build.


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 40440 16-Oct-1998 peter

Sample initial set of kld-ified modules. Not all have been completely
converted yet. These are more of a starting point. This is NOT connected
to the parent Makefile.

OK'ed by jkh (who is ever so patiently waiting)


# 37462 07-Jul-1998 bde

Finished previous fix - don't forget to add one dummy options header
to CLEANFILES.

Fixed lots of style bugs.


# 37341 02-Jul-1998 sos

Fix the N'th occurance of missed bits due to opt_???? mucking.

Doesn't anybody TEST code before committing....

This is the N+1'th time these laste couble of days...


# 37294 30-Jun-1998 jmg

add new opt_nfs.h to cleanfiles...


# 37291 30-Jun-1998 jmg

fix buildworld hopefully be3fore anyone complains...

NFS_*TIMO should possibly be converted to sysctl vars (jkh's suggestion),
but in some cases it looks like nfs keeps a copy of the value in a struct

hash sizes are already ifdef'd KERNEL, so there aren't userland inpact
from them...


# 33143 06-Feb-1998 eivind

Back out opt_diagnostic.h changes.


# 33105 04-Feb-1998 eivind

Make the LKMs handle DIAGNOSTIC as a new-style option.


# 32357 08-Jan-1998 eivind

Minor fixups after INET option change.


# 22982 22-Feb-1997 peter

Revert $FreeBSD$ back to $Id$


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 2998 22-Sep-1994 wollman

Create NFS LKM.