History log of /freebsd-10-stable/sys/fs/nfsserver/nfs_nfsdstate.c
Revision Date Author Comments
# 358035 17-Feb-2020 rmacklem

MFC: r357149
Fix a crash in the NFSv4 server.

The PR reported a crash that occurred when a file was removed while
client(s) were actively doing lock operations on it.
Since nfsvno_getvp() will return NULL when the file does not exist,
the bug was obvious and easy to fix via this patch. It is a little
surprising that this wasn't found sooner, but I guess the above
case rarely occurs.

PR: 242768


# 346779 27-Apr-2019 rmacklem

MFC: r346191
Add support for INET6 addresses to the kernel code that dumps open/lock state.

PR#223036 reported that INET6 callback addresses were not printed by
nfsdumpstate(8). This kernel patch adds INET6 addresses to the dump structure,
so that nfsdumpstate(8) can print them out, post-r346190.


# 338132 21-Aug-2018 rmacklem

MFC: r336839
Modify the NFSv4.1 server so that it allows ReclaimComplete as done by ESXi 6.7.

I believe that a ReclaimComplete with rca_one_fs == TRUE is only
to be used after a file system has been transferred to a different
file server. However, RFC5661 is somewhat vague w.r.t. this and
the ESXi 6.7 client does both a ReclaimComplete with rca_one_fs == TRUE
and one with ReclaimComplete with rca_one_fs == FALSE.
Therefore, just ignore the rca_one_fs == TRUE operation and return
NFS_OK without doing anything instead of replying NFS4ERR_NOTSUPP.
This allows the ESXi 6.7 NFSv4.1 client to do a mount.
After discussion on the NFSv4 IETF working group mailing list, doing this
along with setting a flag to note that a ReclaimComplete with rca_one_fs TRUE
was an appropriate way to handle this.
The flag that indicates that a ReclaimComplete with rca_one_fs == TRUE was
done may be used to disable replies of NFS4ERR_GRACE for non-reclaim
state operations in a future commit.

This patch along with r332790, r334492 and r336357 allow ESXi 6.7 NFSv4.1 mounts
work ok. ESX 6.5 NFSv4.1 mounts do not work well, due to what I believe are
violations of RFC-5661 and should not be used.


# 337058 01-Aug-2018 rmacklem

MFC: r336357
Modify the reasons for not issuing a delegation in the NFSv4.1 server.

The ESXi NFSv4.1 client will generate warning messages when the reason for
not issuing a delegation is two. Two refers to a resource limit and I do
not see why it would be considered invalid. However it probably was not the
best choice of reason for not issuing a delegation.
This patch changes the reasons used to ones that the ESXi client doesn't
complain about. This change does not affect the FreeBSD client and does
not appear to affect behaviour of the Linux NFSv4.1 client.
RFC5661 defines these "reasons" but does not give any guidance w.r.t. which
ones are more appropriate to return to a client.


# 336846 28-Jul-2018 rmacklem

MFC: r334492
Add the BindConnectiontoSession operation to the NFSv4.1 server.

Under some fairly unusual circumstances, the Linux NFSv4.1 client is
doing a BindConnectiontoSession operation for TCP connections.
It is also used by the ESXi6.5 NFSv4.1 client.
This patch adds this operation to the NFSv4.1 server.

PR: 226493


# 336518 19-Jul-2018 rmacklem

MFC: r333766
Add a missing nfsrv_freesession() call for an unlikely failure case.

Since NFSv4.1 clients normally create a single session which supports
both fore and back channels, it is unlikely that a callback will fail
due to a lack of a back channel.
However, if this failure occurred, the session wasn't being dereferenced
and would never be free'd.
Found by inspection during pNFS server development.


# 336422 17-Jul-2018 rmacklem

MFC: r333645
End grace for the NFSv4 server if all mounts do ReclaimComplete.

The NFSv4 protocol requires that the server only allow reclaim of state
and not issue any new open/lock state for a grace period after booting.
The NFSv4.0 protocol required this grace period to be greater than the
lease duration (over 2minutes). For NFSv4.1, the client tells the server
that it has done reclaiming state by doing a ReclaimComplete operation.
If all NFSv4 clients are NFSv4.1, the grace period can end once all the
clients have done ReclaimComplete, shortening the time period considerably.
This patch does this. If there are any NFSv4.0 mounts, the grace period
will still be over 2minutes.
This change is only an optimization and does not affect correct operation.


# 336234 12-Jul-2018 rmacklem

MFC: r333579
The NFSv4.1 server should return NFSERR_BACKCHANBUSY instead of NFS_OK.

When an NFSv4.1 session is busy due to a callback being in progress,
nfsrv_freesession() should return NFSERR_BACKCHANBUSY instead of NFS_OK.
The only effect this has is that the DestroySession operation will report
the failure for this case and this probably has little or no effect on a
client. Spotted by inspection and no failures related to this have been
reported.


# 336179 10-Jul-2018 rmacklem

MFC: r333508
Add support for the TestStateID operation to the NFSv4.1 server.

The Linux client now uses the TestStateID operation, so this patch adds
support for it to the NFSv4.1 server. The FreeBSD client never uses this
operation, so it should not be affected.


# 334741 06-Jun-2018 rmacklem

MFC: r333580
Fix a slow leak of session structures in the NFSv4.1 server.

For a fairly rare case of a client doing an ExchangeID after a hard reboot,
the old confirmed clientid still exists, but some clients use a new
co_verifier. For this case, the server was not freeing up the sessions on
the old confirmed clientid.
This patch fixes this case. It also adds two LIST_INIT() macros, which are
actually no-ops, since the structure is malloc()d with M_ZERO so the pointer
is already set to NULL.
It should have minimal impact, since the only way I could exercise this
code path was by doing a hard power cycle (pulling the plus) on a machine
running Linux with a NFSv4.1 mount on the server.
Originally spotted during testing of the ESXi 6.5 client.

PR: 228497


# 334699 06-Jun-2018 rmacklem

MFC: r334396
Strengthen locking for the NFSv4.1 server DestroySession operation.

If a client did a DestroySession on a session while it was still in use,
the server might try to use the session structure after it is free'd.
I think the client has violated RFC5661 if it does this, but this patch
makes DestroySession block all other nfsd threads so no thread could
be using the session when it is free'd. After the DestroySession, nfsd
threads will not be able to find the session. The patch also adds a check
for nd_sessionid being set, although if that was not the case it would have
been all 0s and unlikely to have a false match.
This might fix the crashes described in PR#228497 for the FreeNAS server.

PR: 228497


# 327008 19-Dec-2017 rmacklem

MFC: r326544
Avoid the overhead of acquiring a lock in nfsrv_checkgetattr() when
there are no write delegations issued.

manu@ reported on the freebsd-current@ mailing list that there was
a significant performance hit in nfsrv_checkgetattr() caused by
the acquisition/release of a state lock, even when there were no
write delegations issued.
This patch add a count of outstanding issued write delegations to the
NFSv4 server. This count allows nfsrv_checkgetattr() to return without
acquiring any lock when the count is 0, avoiding the performance hit
for the case where no write delegations are issued.


# 325448 05-Nov-2017 rmacklem

MFC: r324639
Fix the client IP address reported by nfsdumpstate for 64bit arch and NFSv4.1.

The client IP address was not being reported for some NFSv4 mounts by
nfsdumpstate. Upon investigation, two problems were found for mounts
using IPv4. One was that the code (originally written and tested on i386)
assumed that a "u_long" was a "uint32_t" and would exactly store an
IPv4 host address. Not correct for 64bit arches.
Also, for NFSv4.1 mounts, the field was not being filled in. This was
basically correct, because NFSv4.1 does not use a callback address.
However, it meant that nfsdumpstate could not report the client IP addr.
This patch should fix both of these issues.
For IPv6, the address will still not be reported. The original NFSv4 RFC
only specified IPv4 callback addresses. I think this has changed and, if so,
a future commit to fix reporting of IPv6 addresses will be needed.


# 317986 08-May-2017 rmacklem

MFC: r317382
Allow use of a write open stateid for reading in the NFSv4 server.

The NFSv4 RFCs give a server the option of allowing the use of an open
stateid for write access to be used for a Read operation.
This patch enables this by default and adds a sysctl to disable it,
for anyone who does not want this capability.
Allowing this is particularily useful for a pNFS Data Server (DS), since
they are not permitted to allow the use of special stateids.
Discovered during recent testing of the pNFS server under development.


# 310303 19-Dec-2016 rmacklem

MFC: r309566
Fix the NFSv4.1 server for Open reclaim after a reboot.

The NFSv4.1 server failed to update the nfs-stablerestart file for
a client when the client was issued its first Open. As such, recovery
of Opens after a server reboot failed with NFSERR_NOGRACE.
This patch fixes this.
It also changes the code so that it malloc()'s the 1024 byte array
instead of allocating it on the kernel stack for both NFSv4.0 and NFSv4.1.
Note that this bug only affected NFSv4.1 and only when clients attempted
to reclaim Opens after a server reboot.


# 308241 02-Nov-2016 rmacklem

MFC: r307694
A problem w.r.t. interoperation between the FreeBSD NFSv4.1 server with
delegations enabled and the Linux NFSv4.1 client was reported in
reviews.freebsd.org/D7891.
I believe that the FreeBSD server behaviour conforms to the RFC and that
the Linux client has a bug. Therefore, I do not think the proposed patch
is appropriate. When nfsrv_writedelegifpos is non-zero, the FreeBSD
server will issue a write delegation for a read open if possible.
The Linux client then erroneously assumes that the credentials used for
the read open can write the file.
This patch reverses the default value for nfsrv_writedelegifpos to 0 so
that the default behaviour is Linux compatible and adds a sysctl that can
be used to set nfsrv_writedelegifpos.

This change should only affect users that are mounting a FreeBSD server
with delegations enabled (they are not enabled by default) with a Linux
NFSv4.1 client mount.


# 306663 03-Oct-2016 rmacklem

Revert r306659 since the userland changes won't merge and this would
break the build.


# 306659 03-Oct-2016 rmacklem

MFC: r304026
Update the nfsstats structure to include the changes needed by
the patch in D1626 plus changes so that it includes counts for
NFSv4.1 (and the draft of NFSv4.2).
Also, make all the counts uint64_t and add a vers field at the
beginning, so that future revisions can easily be implemented.
There is code in place to handle the old vesion of the nfsstats
structure for backwards binary compatibility.

Subsequent commits will update nfsstat(8) to use the new fields.


# 299222 07-May-2016 rmacklem

MFC: r298495
Fix a LOR in the NFSv4.1 server.

The ordering of acquisition of the state and session mutexes was
reversed in two cases executed when an NFSv4.1 client created/freed
a session. Since clients will typically do this only when mounting
and dismounting, the likelyhood of causing a deadlock was low but possible.
This can only occur for NFSv4.1 mounts, since the others do not
use sessions.
This was detected while testing the pNFS server/client where the
client crashed during dismounting.
The patch also reorders the unlocks, although that isn't necessary
for correct operation.


# 291869 05-Dec-2015 rmacklem

MFC: r291150
When the nfsd threads are terminated, the NFSv4 server state
(opens, locks, etc) is retained, which I believe is correct behaviour.
However, for NFSv4.1, the server also retained a reference to the xprt
(RPC transport socket structure) for the backchannel. This caused
svcpool_destroy() to not call SVC_DESTROY() for the xprt and allowed
a socket upcall to occur after the mutexes in the svcpool were destroyed,
causing a crash.
This patch fixes the code so that the backchannel xprt structure is
dereferenced just before svcpool_destroy() is called, so the code
does do an SVC_DESTROY() on the xprt, which shuts down the socket upcall.


# 287267 28-Aug-2015 rmacklem

MFC: r286790
For the case where an NFSv4.1 ExchangeID operation has the client identifier
that already has a confirmed ClientID, the nfsrv_setclient() function would
not fill in the clientidp being returned. As such, the value of ClientID
returned would be whatever garbage was on the stack.
This patch fixes the problem by filling in these fields.


# 286164 01-Aug-2015 rmacklem

MFC: r286046
This patch fixes a problem where, if the NFSv4 server has a previous
unconfirmed clientid structure for the same client on the last hash list,
this old entry would not be removed/deleted. I do not think this bug would have
caused serious problems, since the new entry would have been before the old one
on the list. This old entry would have eventually been scavenged/removed.


# 284318 12-Jun-2015 rmacklem

Add TUNABLE_INT() macros so that the tunables MFC'd from
head as r284216 are set via /boot/loader.conf in stable/10.
This is a direct commit to stable/10 because TUNABLE_INT()
is deprecated in head.


# 284216 10-Jun-2015 rmacklem

MFC: r283635
Make the size of the hash tables used by the NFSv4 server tunable.
No appreciable change in performance was observed after increasing
the sizes of these tables and then testing with a single client.
However, there was an email that indicated high CPU overheads for
a heavily loaded NFSv4 and it is hoped that increasing the sizes
of the hash tables via these tunables might help.
The tables remain the same size by default.


# 276437 30-Dec-2014 rmacklem

MFC: r276193
A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1
was reported via email. This was caused by a LOR between the
sleep lock used to serialize the local locking (nfsrv_locklf())
and locking the vnode. I believe this patch fixes the problem
by delaying relocking of the vnode until the sleep lock is
unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side
effect of unlocking the vnode, unlocking the vnode was moved to before
the functions that call nfsvno_advlock().
It shouldn't affect the execution of the default case where
vfs.nfsd.enable_locallocks=0.


# 269398 01-Aug-2014 rmacklem

MFC: r268115
Merge the NFSv4.1 server code in projects/nfsv4.1-server over
into head. The code is not believed to have any effect
on the semantics of non-NFSv4.1 server behaviour.
It is a rather large merge, but I am hoping that there will
not be any regressions for the NFS server.


# 265621 07-May-2014 rmacklem

MFC: r264845
Remove an unnecessary level of indirection for an argument.
This simplifies the code and should avoid the clang sparc
port from generating an abort() call.


# 260159 01-Jan-2014 rmacklem

MFC: r259854
The NFSv4 server would call VOP_SETATTR() with a shared locked vnode
when a Getattr for a file is done by a client other than the one that
holds the file's delegation. This would only happen when delegations
are enabled and the problem is fixed by this patch.


# 287267 28-Aug-2015 rmacklem

MFC: r286790
For the case where an NFSv4.1 ExchangeID operation has the client identifier
that already has a confirmed ClientID, the nfsrv_setclient() function would
not fill in the clientidp being returned. As such, the value of ClientID
returned would be whatever garbage was on the stack.
This patch fixes the problem by filling in these fields.


# 286164 01-Aug-2015 rmacklem

MFC: r286046
This patch fixes a problem where, if the NFSv4 server has a previous
unconfirmed clientid structure for the same client on the last hash list,
this old entry would not be removed/deleted. I do not think this bug would have
caused serious problems, since the new entry would have been before the old one
on the list. This old entry would have eventually been scavenged/removed.


# 284318 12-Jun-2015 rmacklem

Add TUNABLE_INT() macros so that the tunables MFC'd from
head as r284216 are set via /boot/loader.conf in stable/10.
This is a direct commit to stable/10 because TUNABLE_INT()
is deprecated in head.


# 284216 10-Jun-2015 rmacklem

MFC: r283635
Make the size of the hash tables used by the NFSv4 server tunable.
No appreciable change in performance was observed after increasing
the sizes of these tables and then testing with a single client.
However, there was an email that indicated high CPU overheads for
a heavily loaded NFSv4 and it is hoped that increasing the sizes
of the hash tables via these tunables might help.
The tables remain the same size by default.


# 276437 30-Dec-2014 rmacklem

MFC: r276193
A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1
was reported via email. This was caused by a LOR between the
sleep lock used to serialize the local locking (nfsrv_locklf())
and locking the vnode. I believe this patch fixes the problem
by delaying relocking of the vnode until the sleep lock is
unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side
effect of unlocking the vnode, unlocking the vnode was moved to before
the functions that call nfsvno_advlock().
It shouldn't affect the execution of the default case where
vfs.nfsd.enable_locallocks=0.


# 269398 01-Aug-2014 rmacklem

MFC: r268115
Merge the NFSv4.1 server code in projects/nfsv4.1-server over
into head. The code is not believed to have any effect
on the semantics of non-NFSv4.1 server behaviour.
It is a rather large merge, but I am hoping that there will
not be any regressions for the NFS server.


# 265621 07-May-2014 rmacklem

MFC: r264845
Remove an unnecessary level of indirection for an argument.
This simplifies the code and should avoid the clang sparc
port from generating an abort() call.


# 260159 01-Jan-2014 rmacklem

MFC: r259854
The NFSv4 server would call VOP_SETATTR() with a shared locked vnode
when a Getattr for a file is done by a client other than the one that
holds the file's delegation. This would only happen when delegations
are enabled and the problem is fixed by this patch.