Cross Reference: /freebsd-10-stable/sys/fs/nfsserver/nfs

History log of /freebsd-10-stable/sys/fs/nfsserver/nfs_nfsdstate.c
Revision	Date	Author	Comments
# 358035	17-Feb-2020	rmacklem	MFC: r357149 Fix a crash in the NFSv4 server. The PR reported a crash that occurred when a file was removed while client(s) were actively doing lock operations on it. Since nfsvno_getvp() will return NULL when the file does not exist, the bug was obvious and easy to fix via this patch. It is a little surprising that this wasn't found sooner, but I guess the above case rarely occurs. PR: 242768
# 346779	27-Apr-2019	rmacklem	MFC: r346191 Add support for INET6 addresses to the kernel code that dumps open/lock state. PR#223036 reported that INET6 callback addresses were not printed by nfsdumpstate(8). This kernel patch adds INET6 addresses to the dump structure, so that nfsdumpstate(8) can print them out, post-r346190.
# 338132	21-Aug-2018	rmacklem	MFC: r336839 Modify the NFSv4.1 server so that it allows ReclaimComplete as done by ESXi 6.7. I believe that a ReclaimComplete with rca_one_fs == TRUE is only to be used after a file system has been transferred to a different file server. However, RFC5661 is somewhat vague w.r.t. this and the ESXi 6.7 client does both a ReclaimComplete with rca_one_fs == TRUE and one with ReclaimComplete with rca_one_fs == FALSE. Therefore, just ignore the rca_one_fs == TRUE operation and return NFS_OK without doing anything instead of replying NFS4ERR_NOTSUPP. This allows the ESXi 6.7 NFSv4.1 client to do a mount. After discussion on the NFSv4 IETF working group mailing list, doing this along with setting a flag to note that a ReclaimComplete with rca_one_fs TRUE was an appropriate way to handle this. The flag that indicates that a ReclaimComplete with rca_one_fs == TRUE was done may be used to disable replies of NFS4ERR_GRACE for non-reclaim state operations in a future commit. This patch along with r332790, r334492 and r336357 allow ESXi 6.7 NFSv4.1 mounts work ok. ESX 6.5 NFSv4.1 mounts do not work well, due to what I believe are violations of RFC-5661 and should not be used.
# 337058	01-Aug-2018	rmacklem	MFC: r336357 Modify the reasons for not issuing a delegation in the NFSv4.1 server. The ESXi NFSv4.1 client will generate warning messages when the reason for not issuing a delegation is two. Two refers to a resource limit and I do not see why it would be considered invalid. However it probably was not the best choice of reason for not issuing a delegation. This patch changes the reasons used to ones that the ESXi client doesn't complain about. This change does not affect the FreeBSD client and does not appear to affect behaviour of the Linux NFSv4.1 client. RFC5661 defines these "reasons" but does not give any guidance w.r.t. which ones are more appropriate to return to a client.
# 336846	28-Jul-2018	rmacklem	MFC: r334492 Add the BindConnectiontoSession operation to the NFSv4.1 server. Under some fairly unusual circumstances, the Linux NFSv4.1 client is doing a BindConnectiontoSession operation for TCP connections. It is also used by the ESXi6.5 NFSv4.1 client. This patch adds this operation to the NFSv4.1 server. PR: 226493
# 336518	19-Jul-2018	rmacklem	MFC: r333766 Add a missing nfsrv_freesession() call for an unlikely failure case. Since NFSv4.1 clients normally create a single session which supports both fore and back channels, it is unlikely that a callback will fail due to a lack of a back channel. However, if this failure occurred, the session wasn't being dereferenced and would never be free'd. Found by inspection during pNFS server development.
# 336422	17-Jul-2018	rmacklem	MFC: r333645 End grace for the NFSv4 server if all mounts do ReclaimComplete. The NFSv4 protocol requires that the server only allow reclaim of state and not issue any new open/lock state for a grace period after booting. The NFSv4.0 protocol required this grace period to be greater than the lease duration (over 2minutes). For NFSv4.1, the client tells the server that it has done reclaiming state by doing a ReclaimComplete operation. If all NFSv4 clients are NFSv4.1, the grace period can end once all the clients have done ReclaimComplete, shortening the time period considerably. This patch does this. If there are any NFSv4.0 mounts, the grace period will still be over 2minutes. This change is only an optimization and does not affect correct operation.
# 336234	12-Jul-2018	rmacklem	MFC: r333579 The NFSv4.1 server should return NFSERR_BACKCHANBUSY instead of NFS_OK. When an NFSv4.1 session is busy due to a callback being in progress, nfsrv_freesession() should return NFSERR_BACKCHANBUSY instead of NFS_OK. The only effect this has is that the DestroySession operation will report the failure for this case and this probably has little or no effect on a client. Spotted by inspection and no failures related to this have been reported.
# 336179	10-Jul-2018	rmacklem	MFC: r333508 Add support for the TestStateID operation to the NFSv4.1 server. The Linux client now uses the TestStateID operation, so this patch adds support for it to the NFSv4.1 server. The FreeBSD client never uses this operation, so it should not be affected.
# 334741	06-Jun-2018	rmacklem	MFC: r333580 Fix a slow leak of session structures in the NFSv4.1 server. For a fairly rare case of a client doing an ExchangeID after a hard reboot, the old confirmed clientid still exists, but some clients use a new co_verifier. For this case, the server was not freeing up the sessions on the old confirmed clientid. This patch fixes this case. It also adds two LIST_INIT() macros, which are actually no-ops, since the structure is malloc()d with M_ZERO so the pointer is already set to NULL. It should have minimal impact, since the only way I could exercise this code path was by doing a hard power cycle (pulling the plus) on a machine running Linux with a NFSv4.1 mount on the server. Originally spotted during testing of the ESXi 6.5 client. PR: 228497
# 334699	06-Jun-2018	rmacklem	MFC: r334396 Strengthen locking for the NFSv4.1 server DestroySession operation. If a client did a DestroySession on a session while it was still in use, the server might try to use the session structure after it is free'd. I think the client has violated RFC5661 if it does this, but this patch makes DestroySession block all other nfsd threads so no thread could be using the session when it is free'd. After the DestroySession, nfsd threads will not be able to find the session. The patch also adds a check for nd_sessionid being set, although if that was not the case it would have been all 0s and unlikely to have a false match. This might fix the crashes described in PR#228497 for the FreeNAS server. PR: 228497
# 327008	19-Dec-2017	rmacklem	MFC: r326544 Avoid the overhead of acquiring a lock in nfsrv_checkgetattr() when there are no write delegations issued. manu@ reported on the freebsd-current@ mailing list that there was a significant performance hit in nfsrv_checkgetattr() caused by the acquisition/release of a state lock, even when there were no write delegations issued. This patch add a count of outstanding issued write delegations to the NFSv4 server. This count allows nfsrv_checkgetattr() to return without acquiring any lock when the count is 0, avoiding the performance hit for the case where no write delegations are issued.
# 325448	05-Nov-2017	rmacklem	MFC: r324639 Fix the client IP address reported by nfsdumpstate for 64bit arch and NFSv4.1. The client IP address was not being reported for some NFSv4 mounts by nfsdumpstate. Upon investigation, two problems were found for mounts using IPv4. One was that the code (originally written and tested on i386) assumed that a "u_long" was a "uint32_t" and would exactly store an IPv4 host address. Not correct for 64bit arches. Also, for NFSv4.1 mounts, the field was not being filled in. This was basically correct, because NFSv4.1 does not use a callback address. However, it meant that nfsdumpstate could not report the client IP addr. This patch should fix both of these issues. For IPv6, the address will still not be reported. The original NFSv4 RFC only specified IPv4 callback addresses. I think this has changed and, if so, a future commit to fix reporting of IPv6 addresses will be needed.
# 317986	08-May-2017	rmacklem	MFC: r317382 Allow use of a write open stateid for reading in the NFSv4 server. The NFSv4 RFCs give a server the option of allowing the use of an open stateid for write access to be used for a Read operation. This patch enables this by default and adds a sysctl to disable it, for anyone who does not want this capability. Allowing this is particularily useful for a pNFS Data Server (DS), since they are not permitted to allow the use of special stateids. Discovered during recent testing of the pNFS server under development.
# 310303	19-Dec-2016	rmacklem	MFC: r309566 Fix the NFSv4.1 server for Open reclaim after a reboot. The NFSv4.1 server failed to update the nfs-stablerestart file for a client when the client was issued its first Open. As such, recovery of Opens after a server reboot failed with NFSERR_NOGRACE. This patch fixes this. It also changes the code so that it malloc()'s the 1024 byte array instead of allocating it on the kernel stack for both NFSv4.0 and NFSv4.1. Note that this bug only affected NFSv4.1 and only when clients attempted to reclaim Opens after a server reboot.
# 308241	02-Nov-2016	rmacklem	MFC: r307694 A problem w.r.t. interoperation between the FreeBSD NFSv4.1 server with delegations enabled and the Linux NFSv4.1 client was reported in reviews.freebsd.org/D7891. I believe that the FreeBSD server behaviour conforms to the RFC and that the Linux client has a bug. Therefore, I do not think the proposed patch is appropriate. When nfsrv_writedelegifpos is non-zero, the FreeBSD server will issue a write delegation for a read open if possible. The Linux client then erroneously assumes that the credentials used for the read open can write the file. This patch reverses the default value for nfsrv_writedelegifpos to 0 so that the default behaviour is Linux compatible and adds a sysctl that can be used to set nfsrv_writedelegifpos. This change should only affect users that are mounting a FreeBSD server with delegations enabled (they are not enabled by default) with a Linux NFSv4.1 client mount.
# 306663	03-Oct-2016	rmacklem	Revert r306659 since the userland changes won't merge and this would break the build.
# 306659	03-Oct-2016	rmacklem	MFC: r304026 Update the nfsstats structure to include the changes needed by the patch in D1626 plus changes so that it includes counts for NFSv4.1 (and the draft of NFSv4.2). Also, make all the counts uint64_t and add a vers field at the beginning, so that future revisions can easily be implemented. There is code in place to handle the old vesion of the nfsstats structure for backwards binary compatibility. Subsequent commits will update nfsstat(8) to use the new fields.
# 299222	07-May-2016	rmacklem	MFC: r298495 Fix a LOR in the NFSv4.1 server. The ordering of acquisition of the state and session mutexes was reversed in two cases executed when an NFSv4.1 client created/freed a session. Since clients will typically do this only when mounting and dismounting, the likelyhood of causing a deadlock was low but possible. This can only occur for NFSv4.1 mounts, since the others do not use sessions. This was detected while testing the pNFS server/client where the client crashed during dismounting. The patch also reorders the unlocks, although that isn't necessary for correct operation.
# 291869	05-Dec-2015	rmacklem	MFC: r291150 When the nfsd threads are terminated, the NFSv4 server state (opens, locks, etc) is retained, which I believe is correct behaviour. However, for NFSv4.1, the server also retained a reference to the xprt (RPC transport socket structure) for the backchannel. This caused svcpool_destroy() to not call SVC_DESTROY() for the xprt and allowed a socket upcall to occur after the mutexes in the svcpool were destroyed, causing a crash. This patch fixes the code so that the backchannel xprt structure is dereferenced just before svcpool_destroy() is called, so the code does do an SVC_DESTROY() on the xprt, which shuts down the socket upcall.
# 287267	28-Aug-2015	rmacklem	MFC: r286790 For the case where an NFSv4.1 ExchangeID operation has the client identifier that already has a confirmed ClientID, the nfsrv_setclient() function would not fill in the clientidp being returned. As such, the value of ClientID returned would be whatever garbage was on the stack. This patch fixes the problem by filling in these fields.
# 286164	01-Aug-2015	rmacklem	MFC: r286046 This patch fixes a problem where, if the NFSv4 server has a previous unconfirmed clientid structure for the same client on the last hash list, this old entry would not be removed/deleted. I do not think this bug would have caused serious problems, since the new entry would have been before the old one on the list. This old entry would have eventually been scavenged/removed.
# 284318	12-Jun-2015	rmacklem	Add TUNABLE_INT() macros so that the tunables MFC'd from head as r284216 are set via /boot/loader.conf in stable/10. This is a direct commit to stable/10 because TUNABLE_INT() is deprecated in head.
# 284216	10-Jun-2015	rmacklem	MFC: r283635 Make the size of the hash tables used by the NFSv4 server tunable. No appreciable change in performance was observed after increasing the sizes of these tables and then testing with a single client. However, there was an email that indicated high CPU overheads for a heavily loaded NFSv4 and it is hoped that increasing the sizes of the hash tables via these tunables might help. The tables remain the same size by default.
# 276437	30-Dec-2014	rmacklem	MFC: r276193 A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1 was reported via email. This was caused by a LOR between the sleep lock used to serialize the local locking (nfsrv_locklf()) and locking the vnode. I believe this patch fixes the problem by delaying relocking of the vnode until the sleep lock is unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side effect of unlocking the vnode, unlocking the vnode was moved to before the functions that call nfsvno_advlock(). It shouldn't affect the execution of the default case where vfs.nfsd.enable_locallocks=0.
# 269398	01-Aug-2014	rmacklem	MFC: r268115 Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server.
# 265621	07-May-2014	rmacklem	MFC: r264845 Remove an unnecessary level of indirection for an argument. This simplifies the code and should avoid the clang sparc port from generating an abort() call.
# 260159	01-Jan-2014	rmacklem	MFC: r259854 The NFSv4 server would call VOP_SETATTR() with a shared locked vnode when a Getattr for a file is done by a client other than the one that holds the file's delegation. This would only happen when delegations are enabled and the problem is fixed by this patch.
# 287267	28-Aug-2015	rmacklem	MFC: r286790 For the case where an NFSv4.1 ExchangeID operation has the client identifier that already has a confirmed ClientID, the nfsrv_setclient() function would not fill in the clientidp being returned. As such, the value of ClientID returned would be whatever garbage was on the stack. This patch fixes the problem by filling in these fields.
# 286164	01-Aug-2015	rmacklem	MFC: r286046 This patch fixes a problem where, if the NFSv4 server has a previous unconfirmed clientid structure for the same client on the last hash list, this old entry would not be removed/deleted. I do not think this bug would have caused serious problems, since the new entry would have been before the old one on the list. This old entry would have eventually been scavenged/removed.
# 284318	12-Jun-2015	rmacklem	Add TUNABLE_INT() macros so that the tunables MFC'd from head as r284216 are set via /boot/loader.conf in stable/10. This is a direct commit to stable/10 because TUNABLE_INT() is deprecated in head.
# 284216	10-Jun-2015	rmacklem	MFC: r283635 Make the size of the hash tables used by the NFSv4 server tunable. No appreciable change in performance was observed after increasing the sizes of these tables and then testing with a single client. However, there was an email that indicated high CPU overheads for a heavily loaded NFSv4 and it is hoped that increasing the sizes of the hash tables via these tunables might help. The tables remain the same size by default.
# 276437	30-Dec-2014	rmacklem	MFC: r276193 A deadlock in the NFSv4 server with vfs.nfsd.enable_locallocks=1 was reported via email. This was caused by a LOR between the sleep lock used to serialize the local locking (nfsrv_locklf()) and locking the vnode. I believe this patch fixes the problem by delaying relocking of the vnode until the sleep lock is unlocked (nfsrv_unlocklf()). To avoid nfsvno_advlock() having the side effect of unlocking the vnode, unlocking the vnode was moved to before the functions that call nfsvno_advlock(). It shouldn't affect the execution of the default case where vfs.nfsd.enable_locallocks=0.
# 269398	01-Aug-2014	rmacklem	MFC: r268115 Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server.
# 265621	07-May-2014	rmacklem	MFC: r264845 Remove an unnecessary level of indirection for an argument. This simplifies the code and should avoid the clang sparc port from generating an abort() call.
# 260159	01-Jan-2014	rmacklem	MFC: r259854 The NFSv4 server would call VOP_SETATTR() with a shared locked vnode when a Getattr for a file is done by a client other than the one that holds the file's delegation. This would only happen when delegations are enabled and the problem is fixed by this patch.