#
03a39a17 |
|
27-Apr-2024 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clear out a lot of cruft related to B_DIRECT There is only one place in the unpatched sources where B_DIRECT is set in the NFS client and this code is never executed. As such, this patch removes this code that is never executed, since B_DIRECT should never be set. During a IETF testing event this week, I saw a crash in ncl_doio_directwrite(), but this function is only called if B_DIRECT is set. I cannot explain how ncl_doio_directwrite() got called, but once this patch was applied to the sources, the crash did not recur. This is not surprising, since this patch deleted the function. Reviewed by: kib, markj MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D44980
|
#
d00c64bb |
|
11-Apr-2024 |
Zaphrod Beeblebrox <zbeeble@gmail.com> |
nfscl: Purge name cache when readdir_plus is done The author reported that this patch was needed to avoid crashes on a fairly busy RISC-V system. The author did not provide details w.r.t. the crashes. Although I have not seen any such crash, the patch looks reasonable and I have not found any regressions when testing it. Since "rdirplus" is not a default option, the patch is only needed if you are doing NFS mounts with the "rdirplus" mount option and seeing crashes related to the name cache. MFC after: 1 week
|
#
cc760de2 |
|
11-Jan-2024 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Only update atime for Copy when noatime is not specified Commit 57ce37f9dcd0 modified the NFSv4.2 Copy operation so that it will update atime on the infd file whenever possible. This is done by adding a Setattr of TimeAccess for the input file. This patch disables this change for the case of an NFSv4.2 mount with the "noatime" mount option, which avoids the additional Setattr of TimeAccess operation. MFC after: 1 week
|
#
b068bb09 |
|
07-Jan-2024 |
Konstantin Belousov <kib@FreeBSD.org> |
Add vnode_pager_clean_{a,}sync(9) Bump __FreeBSD_version for ZFS use. Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43356
|
#
656d2e83 |
|
30-Dec-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
nfsclient: eliminate ncl_writebp() Use plain bufwrite() instead. ncl_writebp() evolved to mostly repeat bufwrite() code with some ommisions, most notably runningbufspace accounting. Reviewed by: imp, markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43249
|
#
47ec00d9 |
|
29-Dec-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
nfsclient: flush dirty pages of the vnode before ncl_flush() when done to ensure that the server sees our cached data, because it potentially changes the server response. This is relevant for copy_file_range(), seek(), and allocate(). Convert LK_SHARED invp lock into LK_EXCLUSIVE if needed to properly call vm_object_page_clean(). Reported by: asomers PR: 276002 Noted and reviewed by: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43250
|
#
7dae1467 |
|
29-Dec-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
nfsclient copy_file_range(): flush dst vnode data Otherwise server-side copy makes the client cache inconsistent with the server data. Reported by: asomers PR: 276002 Reviewed by: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D43250
|
#
6fa843f6 |
|
11-Dec-2023 |
Mark Johnston <markj@FreeBSD.org> |
nfsclient: Propagate copyin() errors from nfsm_uiombuf() Approved by: so Security: SA-23:18.nfsclient Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation
|
#
c5405d1c |
|
18-Nov-2023 |
Konstantin Belousov <kib@FreeBSD.org> |
vn_copy_file_range(): provide ENOSYS fallback to vn_generic_copy_file_range() Reviewed by: markj, Olivier Certner <olce.freebsd@certner.fr> Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D42603
|
#
196787f7 |
|
20-Oct-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Use Claim_Null_FH and Claim_Deleg_Cur_FH For NFSv4.1/4.2, there are two new options for the Open operation. These two options use the file handle for the file instead of the file handle for the directory plus a file name. By doing so, the client code is simplified (it no longer needs the "nfsv4node" structure attached to the NFS vnode). It also avoids problems caused by another NFS client (or process running locally in the NFS server) doing a rename or remove of the file name between the Lookup and Open. Unfortunately, there was a bug (fixed recently by commit X) in the NFS server which mis-parsed the Claim_Deleg_Cur_FH arguments. To allow this patch to work with the broken FreeBSD NFSv4.1/4.2 server, NFSMNTP_BUGGYFBSDSRV is defined and is set when a correctly formatted Claim_Deleg_Cur_FH fails with NFSERR_EXPIRED. (This is what the old, broken NFS server does, since it erroneously uses the Getattr arguments as a stateID.) Once this flag is set, the client fills in a stateID, to make the broken NFS server happy. Tested at a recent IETF NFSv4 Bakeathon. MFC after: 1 month
|
#
57ce37f9 |
|
18-Oct-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Make NFSv4.2 Copy set atime on infd RFC7862 does not specify infile atime behaviour when a NFSv4.2 Copy operation is performed. Since the collective opinion of a mailing list discussion (on freebsd-hackers@) seemed to indicate that copy_file_range(2) should update atime on the infd, even if there is no data copied, this patch attempts to ensure that behaviour. For Copy, it preceeds the Copy operation with a Setattr of TimeAccess_Set(NFSv4. speak for atime) for the invp. For the case where no data will be copied, it does a Setattr RPC to set TimeAccess_Set for the invp. A __FreeBSD_version bump will be done as a separate commit, since this patch changes the internal interface between the nfscommon and nfscl modules. MFC after: 1 month
|
#
969071be |
|
06-Sep-2023 |
Martin Matuska <mm@FreeBSD.org> |
vfs: copy_file_range() between multiple mountpoints of the same fs type VOP_COPY_FILE_RANGE(9) is now caled when source and target vnodes reside on the same filesystem type (not just on the same mountpoint). The check if vnodes are on the same mountpoint must be done in the filesystem code. There are currently only three users - fusefs(5) already has this check, ZFS can handle multiple mountpoints and a check has been added to NFS client. ZFS block cloning is now possible between all snapshots and datasets of the same ZFS pool. MFC after: 1 week Reviewed by: rmacklem Differential Revision: https://reviews.freebsd.org/D41721
|
#
685dc743 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: one-line .c pattern Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
|
#
1512579a |
|
27-Mar-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Make coverity happy Coverity does not like code that checks a function's return value sometimes. Add "(void)" in front of the function when the return value does not matter to try and make it happy. Reported by: emaste MFC after: 3 months
|
#
896516e5 |
|
16-Mar-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a new NFSv4.1/4.2 mount option for Kerberized mounts Without this patch, a Kerberized NFSv4.1/4.2 mount must provide a Kerberos credential for the client at mount time. This credential is typically referred to as a "machine credential". It can be created one of two ways: - The user (usually root) has a valid TGT at the time the mount is done and this becomes the machine credential. There are two problems with this. 1 - The user doing the mount must have a valid TGT for a user principal at mount time. As such, the mount cannot be put in fstab(5) or similar. 2 - When the TGT expires, the mount breaks. - The client machine has a service principal in its default keytab file and this service principal (typically called a host-based initiator credential) is used as the machine credential. There are problems with this approach as well: 1 - There is a certain amount of administrative overhead creating the service principal for the NFS client, creating a keytab entry for this principal and then copying the keytab entry into the client's default keytab file via some secure means. 2 - The NFS client must have a fixed, well known, DNS name, since that FQDN is in the service principal name as the instance. This patch uses a feature of NFSv4.1/4.2 called SP4_NONE, which allows the state maintenance operations to be performed by any authentication mechanism, to do these operations via AUTH_SYS instead of RPCSEC_GSS (Kerberos). As such, neither of the above mechanisms is needed. It is hoped that this option will encourage adoption of Kerberized NFS mounts using TLS, to provide a more secure NFS mount. This new NFSv4.1/4.2 mount option, called "syskrb5" must be used with "sec=krb5[ip]" to avoid the need for either of the above Kerberos setups to be done by the client. Note that all file access/modification operations still require users on the NFS client to have a valid TGT recognized by the NFSv4.1/4.2 server. As such, this option allows, at most, a malicious client to do some sort of DOS attack. Although not required, use of "tls" with this new option is encouraged, since it provides on-the-wire encryption plus, optionally, client identity verification via a X.509 certificate provided to the server during TLS handshake. Alternately, "sec=krb5p" does provide on-the-wire encryption of file data. A mount_nfs(8) man page update will be done in a separate commit. Discussed on: freebsd-current@ MFC after: 3 months
|
#
847967bc |
|
08-Feb-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix interaction between mmap'd and VOP_WRITE file updates asomers@ found a problem with the NFS client, where a write to an NFS mounted file done via mmap(2) was lost when fspacectl(2) was done before it. This turned out to be caused by clearing the dirty bit on pages when the client was doing commit RPCs, due to the second argument to vfs_busy_pages() being set to 1. Commit RPCs tell the server to commit previously written data to stable storage. However, Commit RPCs do not write data from the client to the server. As such, if the dirty bit on the page has been set by a mmap'd write to an address in the page, it should not be cleared. Clearing it causes the mmap'd write to by lost. This patch fixes the problem by changing the 2nd argument to vfs_busy_pages() to 0 for this case. I doubt this bug has affected many, since it was inherited from the old NFS client and was in 4.3 FreeBSD twenty years ago. Although fspacectl(2) is FreeBSD 14 specific, a write(2) would cause the same failure. Reviewed by: kib Tested by: asomers PR: 269328 MFC after: 1 week
|
#
a82308ab |
|
01-Oct-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfs_clvnops.c: Fix access to v_mount when vnode unlocked Commit ab17854f974b fixed access to v_mount when the vnode is unlocked for nfs_copy_file_range(). This patch does the same for nfs_advlockasync(). MFC after: 1 week
|
#
bffb3d94 |
|
01-Oct-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfs_clvnops.c: Fix access to v_mount when vnode unlocked Commit ab17854f974b fixed access to v_mount when the vnode is unlocked for nfs_copy_file_range(). This patch does the same for nfs_ioctl(). Reviewed by: kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D36846
|
#
ab17854f |
|
26-Sep-2022 |
Konstantin Belousov <kib@FreeBSD.org> |
nfsclient: access v_mount only after the vnode is locked and we checked that it is not reclaimed. Reviewed by: markj, rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D36722
|
#
52360ca3 |
|
25-Sep-2022 |
Alan Somers <asomers@FreeBSD.org> |
copy_file_range: truncate write if it would exceed RLIMIT_FSIZE PR: 266611 MFC after: 2 weeks Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D36706
|
#
5b5b7e2c |
|
17-Sep-2022 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: always retain path buffer after lookup This removes some of the complexity needed to maintain HASBUF and allows for removing injecting SAVENAME by filesystems. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D36542
|
#
33721eb9 |
|
04-Sep-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Allow "nolockd" to work for NFSv4 mounts Commit 40ada74ee1da modified the NFSv4.1/4.2 client so that it would issue a DestroySession to the server when all session slots are marked bad. This handles the case where session slots get broken when "intr" or "soft" NFSv4 fairly well.1/4.2 mounts are done. There are two other cases where having an NFSv4.1/4.2 RPC attempt terminate without completion can leave state in a non-determinate condition. One is file locking RPCs. If the "nolockd" option is used, this avoids file locking RPCs by doing locking locally within the client. The other is Open locks, but since all FreeBSD Open locks are done with OPEN_SHARE_DENY_NONE, the locking state for these should not be critical. This patch enables use of "nolockd" for NFSv4 mounts, so that it can be combined with "intr" and/or "soft", making the latter more usable. Use of "intr" or "soft" NFSv4 mounts are still not recommended, but when combined with "nolockd" should now work fairly well. A man page update will be done as a separate commit. MFC after: 2 weeks
|
#
3c4266ed |
|
17-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c and nfs_clstate.c. This commit should not result in a semantics change.
|
#
1e70163c |
|
17-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c and nfs_clvfsops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
c692ea40 |
|
16-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
af6665e0 |
|
16-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
8cb42d69 |
|
15-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
da47c186 |
|
15-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
1c665e95 |
|
14-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
41c029d5 |
|
13-Jun-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for assorted functions defined in nfs_clrpcops.c and called in nfs_clvnops.c. Future commits will do the same for other functions. This commit should not result in a semantics change.
|
#
9792c7d3 |
|
31-May-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Enable support for the Lookup+Open RPC Commits 3ad1e1c1ce20 and 57014f21e754 added a Lookup+Open RPC for NFSv4.1/4.2, which can reduce the RPC count by 10-20% for some loads. This has now received a fair amount of testing, so I think it is ok to enable it. Note that the Lookup+Open RPC is only used when the "oneopenown" mount option is specified. As such, this change won't affect most NFSv4.1/4.2 mounts.
|
#
5218d82c |
|
30-Apr-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add support for a NFSv4 AppendWrite RPC For IO_APPEND VOP_WRITE()s, the code first does a Getattr RPC to acquire the file's size, before it can do the Write RPC. Although NFS does not have an append write operation, an NFSv4 compound can use a Verify operation to check that the client's notion of the file's size is correct, followed by the Write operation. This patch modifies the NFSv4 client to use an Appendwrite RPC, which does a Verify to check the file's size before doing the Write. This avoids the need for a Getattr RPC to preceed this RPC and reduces the RPC count by half for IO_APPEND writes, so long as the client knows the file's size. The nfsd structure was moved from the stack to be malloc()'d, since the kernel stack limit was being exceeded. While here, fix the types of a few variables, although there should not be any semantics change caused by these type changes.
|
#
068fc057 |
|
14-Apr-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for nfscl_nget(). Future commits will do the same for other functions.
|
#
4ad3423b |
|
13-Apr-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Clean up the code by removing unused arguments The "void *stuff" (also called fstuff and dstuff) argument was used by the Mac OSX port. For FreeBSD, this argument is always NULL, so remove it to clean up the code. This commit gets rid of "stuff" for nfscl_loadattrcache(). Future commits will do the same for other functions.
|
#
f37dc50d |
|
17-Mar-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Do not do a Lookup+Open for pNFS mounts A NFSv4.1/4.2 pNFS mount needs to do a separate Open+LayoutGet RPC, so do not do a Lookup+Open RPC for these mounts. The Lookup+Open RPCs are still disabled, until further testing is done, so this patch has no effect at this time.
|
#
150da1e3 |
|
16-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Partially revert commit 867c27c23a5c Commit 867c27c23a5c enabled the n_directio_opens code in open/close, which sets/clears NNONCACHE, for IO_APPEND. This code should not be enabled unless newnfs_directio_enable is non-zero. This patch reverts that part of commit 867c27c23a5c. A future patch that fixes the case where the file that is being written IO_APPEND is mmap()'d. MFC after: 3 months
|
#
867c27c2 |
|
15-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Change IO_APPEND writes to direct I/O IO_APPEND writes have always been very slow over NFS, due to the need to acquire an up to date file size after flushing all writes to the NFS server. This patch switches the IO_APPEND writes to use direct I/O, bypassing the buffer cache. As such, flushing of writes normally only occurs when the open(..O_APPEND..) is done. It does imply that all writes must be done synchronously and must be committed to stable storage on the file server (NFSWRITE_FILESYNC). For a simple test program that does 10,000 IO_APPEND writes in a loop, performance improved significantly with this patch. For a UFS exported file system, the test ran 12x faster. This drops to 3x faster when the open(2)/close(2) are done for each loop iteration. For a ZFS exported file system, the test ran 40% faster. The much smaller improvement may have been because the ZFS file system I tested against does not have a ZIL log and does have "sync" enabled. Note that IO_APPEND write performance is still much slower than when done on local file systems. Although this is a simple patch, it does result in a significant semantics change, so I have given it a large MFC time. Tested by: otis MFC after: 3 months
|
#
fe04c911 |
|
13-Dec-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: add a filesize limit check to nfs_allocate() As reported in PR#260343, nfs_allocate() did not check the filesize rlimit. This patch adds that check. PR: 260343 Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D33422
|
#
c3134a6a |
|
27-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Disable use of the LookupOpen RPC The LookupOpen RPC reduces the number of Open RPCs needed. Unfortunately, it breaks certain software builds over NFS, so disable it until this is fixed. The LookupOpen RPC is only used for NFSv4.1/4.2 mounts when the "oneopenown" mount option is specified, so this should not affect many users.
|
#
8ef0c11e |
|
15-Nov-2021 |
Konstantin Belousov <kib@FreeBSD.org> |
nfsclient: upgrade vnode lock in VOP_OPEN()/VOP_CLOSE() if we need to flush buffers VOP_FSYNC() asserts that the vnode is exclusively locked for NFS. If we try to execute file with recently modified content, the assert is triggered. Reviewed by: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D32999
|
#
f0c9847a |
|
06-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
vfs: Add "ioflag" and "cred" arguments to VOP_ALLOCATE When the NFSv4.2 server does a VOP_ALLOCATE(), it needs the operation to be done for the RPC's credential and not td_ucred. It also needs the writing to be done synchronously. This patch adds "ioflag" and "cred" arguments to VOP_ALLOCATE() and modifies vop_stdallocate() to use these arguments. The VOP_ALLOCATE.9 man page will be patched separately. Reviewed by: khng, kib Differential Revision: https://reviews.freebsd.org/D32865
|
#
6b677534 |
|
03-Nov-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix forced dismount from looping on commit When a forced dismount is in progress, it is possible to end up looping, retrying commits that fail. This patch fixes the problem by pretending that commits succeeded when a forced dismount is in prgress. MFC after: 2 weeks
|
#
50dcff08 |
|
30-Oct-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add setting n_localmodtime to the Write RPC code Similar to commit 2be417843a, I believe there could be a race between the NFS client VOP_LOOKUP() and file Writing that could result in stale file attributes being loaded into the NFS vnode by VOP_LOOKUP(). I have not been able to reproduce a failure due to this race, but I believe that there are two possibilities: The Lookup RPC happens while VOP_WRITE() is being executed and loads stale file attributes after VOP_WRITE() returns when it has already completed the Write/Commit RPC(s). --> For this case, setting the local modify timestamp at the end of VOP_WRITE() should ensure that stale file attributes are not loaded. The Lookup RPC occurs after VOP_WRITE() has returned, while asynchronous Write/Commit RPCs are in progress and then is blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP() then acquires the NFS vnode lock and fills in stale file attributes. --> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf() when they clear NMODIFIED should ensure that stale file attributes are not loaded. This patch does the above. PR: 259071 Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D32677
|
#
ab87c39c |
|
30-Oct-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Set n_localmodtime in Deallocate Commit 2be417843a04 added n_localmodtime, which is used by Lookup and ReaddirPlus to check to see if the file attributes in an RPC reply might be stale. This patch sets n_localmodtime in Deallocate. Done as a separate commit, since Deallocate is not in stable/13. PR: 259071 Reviewed by: asomers Differential Revision: https://reviews.freebsd.org/D32635
|
#
2be41784 |
|
30-Oct-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
PR#259071 provides a test program that fails for the NFS client. Testing with it, there appears to be a race between Lookup and VOPs like Setattr-of-size, where Lookup ends up loading stale attributes (including what might be the wrong file size) into the NFS vnode's attribute cache. The race occurs when the modifying VOP (which holds a lock on the vnode), blocks the acquisition of the vnode in Lookup, after the RPC (with now potentially stale attributes). Here's what seems to happen: Child Parent does stat(), which does VOP_LOOKUP(), doing the Lookup RPC with the directory vnode locked, acquiring file attributes valid at this point in time blocks waiting for locked file does ftruncate(), which vnode does VOP_SETATTR() of Size, changing the file's size while holding an exclusive lock on the file's vnode releases the vnode lock acquires file vnode and fills in now stale attributes including the old wrong Size does a read() which returns wrong data size This patch fixes the problem by saving a timestamp in the NFS vnode in the VOPs that modify the file (Setattr-of-size, Allocate). Then lookup/readdirplus compares that timestamp with the time just before starting the RPC after it has acquired the file's vnode. If the modifying RPC occurred during the Lookup, the attributes in the RPC reply are discarded, since they might be stale. With this patch the test program works as expected. Note that the test program does not fail on a July stable/12, although this race is in the NFS client code. I suspect a fairly recent change to the name caching code exposed this bug. PR: 259071 Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D32635
|
#
b4a58fbf |
|
01-Oct-2021 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove cn_thread It is always curthread. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D32453
|
#
235891a1 |
|
10-Oct-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Fix NFS VOP_ALLOCATE for mounts without Allocate support Without this patch, nfs_allocate() fell back on using vop_stdallocate() for NFS mounts without Allocate operation support. This was incorrect, since some file systems, such as ZFS, cannot do allocate via vop_stdallocate(), which uses writes to try and allocate blocks. Also, fix nfs_allocate() to return EINVAL when mounts cannot do Allocate, since that is the correct error for posix_fallocate(2). Note that Allocate is only supported by some NFSv4.2 servers. MFC after: 2 weeks
|
#
ad6dc365 |
|
18-Sep-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Use vfs.nfs.maxalloclen to limit Deallocate RPC RTT Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not allow a reply with partial completion. As such, the only way to limit the time the operation takes to provide a reasonable RPC RTT is to limit the size of the allocation/deallocation in the NFSv4.2 client. This patch uses the sysctl vfs.nfs.maxalloclen to set the limit on the size of the Deallocate operation. There is no way to know how long a server will take to do an deallocate operation, but 64Mbytes results in a reasonable RPC RTT for the slow hardware I test on. For an 8Gbyte deallocation, the elapsed time for doing it in 64Mbyte chunks was the same (within margin of variability) as the elapsed time taken for a single large deallocation operation for a FreeBSD server with a UFS file system.
|
#
9ebe4b8c |
|
15-Sep-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add vfs.nfs.maxalloclen to limit Allocate/Deallocate RPC RTT Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not allow a reply with partial completion. As such, the only way to limit the time the operation takes to provide a reasonable RPC RTT is to limit the size of the allocation/deallocation in the NFSv4.2 client. This patch adds a sysctl called vfs.nfs.maxalloclen to set the limit on the size of the Allocate operation. There is no way to know how long a server will take to do an allocate operation, but 64Mbytes results in a reasonable RPC RTT for the slow hardware I test on, so that is what the default value for vfs.nfs.maxalloclen is set to. For an 8Gbyte allocation, the elapsed time for doing it in 64Mbyte chunks was the same as the elapsed time taken for a single large allocation operation for a FreeBSD server with a UFS file system. MFC after: 2 weeks
|
#
08b9cc31 |
|
27-Aug-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a VOP_DEALLOCATE() for the NFSv4.2 client This patch adds a VOP_DEALLOCATE() to the NFS client. For NFSv4.2 servers that support the Deallocate operation, it is used. Otherwise, it falls back on calling vop_stddeallocate(). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D31640
|
#
3ad1e1c1 |
|
11-Aug-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a Lookup+Open RPC for NFSv4.1/4.2 This patch adds a Lookup+Open compound RPC to the NFSv4.1/4.2 NFS client, which can be used by nfs_lookup() so that a subsequent Open RPC is not required. It uses the cn_flags OPENREAD, OPENWRITE added by commit c18c74a87c15. This reduced the number of RPCs by about 15% for a kernel build over NFS. For now, use of Lookup+Open is only done when the "oneopenown" mount option is used. It may be possible for Lookup+Open to be used for non-oneopenown NFSv4.1/4.2 mounts, but that will require extensive further testing to determine if it works. While here, I've added the changes to the nfscommon module that are needed to implement the Deallocate NFSv4.2 operation. This avoids needing another cycle of changes to the internal KAPI between the NFS modules. This commit has changed the internal KAPI between the NFS modules and, as such, all need to be rebuilt from sources. I have not bumped __FreeBSD_version, since it was bumped a few days ago.
|
#
efea1bc1 |
|
28-Jul-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Cache an open stateid for the "oneopenown" mount option For NFSv4.1/4.2, if the "oneopenown" mount option is used, there is, at most, only one open stateid for each NFS vnode. When an open stateid for a file is acquired, set a pointer to the open structure in the NFS vnode. This pointer can be used to acquire the open stateid without searching the open linked list when the following is true: - No delegations have been issued for the file. Since delegations can outlive an NFS vnode for a file, use the global NFSMNTP_DELEGISSUED flag on the mount to determine this. - No lock stateid has been issued for the file. To determine this, a new NFS vnode flag called NMIGHTBELOCKED is set when a lock stateid is issued, which can then be tested. When this open structure pointer can be used, it avoids the need to acquire the NFSCLSTATELOCK() and searching the open structure list for an open. The NFSCLSTATELOCK() can be highly contended when there are a lot of opens issued for the NFSv4.1/4.2 mount. This patch only affects NFSv4.1/4.2 mounts when the "oneopenown" mount option is used. MFC after: 2 weeks
|
#
dd02d9d6 |
|
07-May-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add support for va_birthtime to NFSv4 There is a NFSv4 file attribute called TimeCreate that can be used for va_birthtime. r362175 added some support for use of TimeCreate. This patch completes support of va_birthtime by adding support for setting this attribute to the server. It also eanbles the client to acquire and set the attribute for a NFSv4 server that supports the attribute. Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30156
|
#
8bde6d15 |
|
04-May-2021 |
Mark Johnston <markj@FreeBSD.org> |
nfsclient: Copy only initialized fields in nfs_getattr() When loading attributes from the cache, the NFS client is careful to copy only the fields that it initialized. After fetching attributes from the server, however, it would copy the entire vattr structure initialized from the RPC response, so uninitialized stack bytes would end up being copied to userspace. In particular, va_birthtime (v2 and v3) and va_gen (v3) had this problem. Use a common subroutine to copy fields provided by the NFS client, and ensure that we provide a dummy va_gen for the v3 case. Reviewed by: rmacklem Reported by: KMSAN MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D30090
|
#
15bed8c4 |
|
28-Feb-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsclient: add nfs node locking around uses of n_direofoffset During code inspection I noticed that the n_direofoffset field of the NFS node was being manipulated without any lock being held to make it SMP safe. This patch adds locking of the NFS node's mutex around handling of n_direofoffset to make it SMP safe. I have not seen any failure that could be attributed to n_direofoffset being manipulated concurrently by multiple processors, but I think this is possible, since directories are read with shared vnode locking, plus locks only on individual buffer cache blocks. However, there have been as yet unexplained issues w.r.t reading large directories over NFS that could have conceivably been caused by concurrent manipulation of n_direofoffset. MFC after: 2 weeks
|
#
3e04ab36 |
|
28-Feb-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsclient: add checks for a server returning the current directory Commit 3fe2c68ba20f dealt with a panic in cache_enter_time() where the vnode referred to the directory argument. It would also be possible to get these panics if a broken NFS server were to return the directory as an new object being created within the directory or in a Lookup reply. This patch adds checks to avoid the panics and logs messages to indicate that the server is broken for the file object creation cases. Reviewd by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D28987
|
#
9f669985 |
|
30-Sep-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFSv4.2 VOP_COPY_FILE_RANGE() client call to return after one successful RPC. Without this patch, the NFSv4.2 VOP_COPY_FILE_RANGE() client call would loop until the copy "len" was completed. The problem with doing this is that it might take a considerable time to complete for a large "len". By returning after a single successful Copy RPC that copied some of the data, the application that did the copy_file_range(2) syscall will be more responsive to signal delivery for large "len" copies.
|
#
586ee69f |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fs: clean up empty lines in .c and .h files
|
#
d292b194 |
|
05-Aug-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: remove the obsolete privused argument from vaccess This brings argument count down to 6, which is passable without the stack on amd64.
|
#
eea79fde |
|
17-Jun-2020 |
Alan Somers <asomers@FreeBSD.org> |
Remove vfs_statfs and vnode_mount macros from NFS These macro definitions are no longer needed as the NFS OSX port is long dead. The vfs_statfs macro conflicts with the vfsops field of the same name. Submitted by: shivank@ Reviewed by: rmacklem MFC after: 2 weeks Sponsored by: Google, Inc. (GSoC 2020) Differential Revision: https://reviews.freebsd.org/D25263
|
#
fb8ed4c5 |
|
14-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFSv2 extended attribute support to handle 0 length attributes. I did not realize that zero length attributes are allowed, but they are. This patch fixes the NFSv4.2 client and server to handle zero length extended attributes correctly. Submitted by: Frank van der Linden <fllinden@amazon.com> (earlier version) Reported by: Frank van der Linden <fllinder@amazon.com>
|
#
6a5abb1e |
|
02-Feb-2020 |
Kyle Evans <kevans@FreeBSD.org> |
Provide O_SEARCH O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping permissions checks on the directory itself after the initial open(). This is close to the semantics we've historically applied for O_EXEC on a directory, which is UB according to POSIX. Conveniently, O_SEARCH on a file is also explicitly undefined behavior according to POSIX, so O_EXEC would be a fine choice. The spec goes on to state that O_SEARCH and O_EXEC need not be distinct values, but they're not defined to be the same value. This was pointed out as an incompatibility with other systems that had made its way into libarchive, which had assumed that O_EXEC was an alias for O_SEARCH. This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a directory is checked in vn_open_vnode already, so for completeness we add a NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not re-check that when descending in namei. [0] https://pubs.opengroup.org/onlinepubs/9699919799/ Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D23247
|
#
b249ce48 |
|
03-Jan-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: drop the mostly unused flags argument from VOP_UNLOCK Filesystems which want to use it in limited capacity can employ the VOP_UNLOCK_FLAGS macro. Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D21427
|
#
6fa079fc |
|
15-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tesetd by: pho Differential Revision: https://reviews.freebsd.org/D22738
|
#
bf6ac05a |
|
12-Dec-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add some more initializations to quiet riscv build. The one case in nfs_copy_file_range() was a legitimate case, although it would probably never occur in practice.
|
#
c057a378 |
|
12-Dec-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for NFSv4.2 to the NFS client and server. This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes (RFC-8276) to the NFS client and server. NFSv4.2 is comprised of several optional features that can be supported in addition to NFSv4.1. This patch adds the following optional features: - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED) - posix_fallocate() - intra server file range copying via the copy_file_range(2) syscall --> Avoiding data tranfer over the wire to/from the NFS client. - lseek(SEEK_DATA/SEEK_HOLE) - Extended attribute syscalls for "user" namespace attributes as defined by RFC-8276. Although this patch is fairly large, it should not affect support for the other versions of NFS. However it does add two new sysctls that allow a sysadmin to limit which minor versions of NFSv4 a server supports, allowing a sysadmin to disable NFSv4.2. Unfortunately, when the NFS stats structure was last revised, it was assumed that there would be no additional operations added beyond what was specified in RFC-7862. However RFC-8276 did add additional operations, forcing the NFS stats structure to revised again. It now has extra unused entries in all arrays, so that future extensions to NFSv4.2 can be accomodated without revising this structure again. A future commit will update nfsstat(1) to report counts for the new NFSv4.2 specific operations/procedures. This patch affects the internal interface between the nfscommon, nfscl and nfsd modules and, as such, they all must be upgraded simultaneously. I will do a version bump (although arguably not needed), due to this. This code has survived a "make universe" but has not been built with a recent GCC. If you encounter build problems, please email me. Relnotes: yes
|
#
abd80ddb |
|
08-Dec-2019 |
Mateusz Guzik <mjg@FreeBSD.org> |
vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715
|
#
9698d992 |
|
29-Nov-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
In nfs_lock(), recheck vp->v_data after lock before accessing it. We might race with reclaim, and then this is no longer a nfs vnode, in which case we do not need to handle deferred vnode_pager_setsize() either. Reported by: rk@ronald.org PR: 242184 Sponsored by: The FreeBSD Foundation MFC after: 3 days
|
#
67d0e293 |
|
29-Oct-2019 |
Jeff Roberson <jeff@FreeBSD.org> |
Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY flag and use the same system. This enables further fault locking improvements by allowing more faults to proceed with a shared lock. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22116
|
#
c6ba06d8 |
|
22-Oct-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix interface between nfsclient and vnode pager. Make the nfsclient always call vnode_pager_setsize() with the vnode exclusively locked. This ensures that page fault always can find the backing page if the object size check succeeded. Set VV_VMSIZEVNLOCK flag on NFS nodes. The main offender breaking the interface in nfsclient is nfs_loadattrcache(), which is used whenever server responded with updated attributes, which can happen on non-changing operations as well. Also, iod threads only have buffers locked (and even that is LK_KERNPROC), but they still may call nfs_loadattrcache() on RPC response. Instead of immediately calling vnode_pager_setsize() if server response indicated changed file size, but the vnode is not exclusively locked, set a new node flag NVNSETSZSKIP. When the vnode exclusively locked, or when we can temporary upgrade the lock to exclusive, call vnode_pager_setsize(), by providing the nfsclient VOP_LOCK() implementation. Tested by: pho Discussed with: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D21883
|
#
5d85e12f |
|
23-Sep-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros. For a long time, some places in the NFS code have locked/unlocked the NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas others have simply used mtx_lock()/mtx_unlock(). Since the NFS node mutex needs to change to an sx lock so it can be held when vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock with the macros to simply making the change to an sx lock in future commit. There is no semantic change as a result of this commit. I am not sure if the change to an sx lock will be MFC'd soon, so I put an MFC of 1 week on this commit so that it could be MFC'd with that commit. Suggested by: kib MFC after: 1 week
|
#
a6935d08 |
|
05-Sep-2019 |
Conrad Meyer <cem@FreeBSD.org> |
Remove long-dead BUF_ASSERT_{,UN}HELD assertions These were fully neutered in r177676 (2008), but not removed at the time for unclear reasons. They're totally dead code, so go ahead and yank them now. No functional change.
|
#
6aab442a |
|
30-May-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Get rid of extraneous initialization. Get rid of an extraneous initialization, mainly to keep a static analyser happy. No semantic change. PR: 238167 Submitted by: Alexey Dokuchaev
|
#
26fd36b2 |
|
30-May-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Clean up silly code case. This silly code segment has existed in the sources since it was brought into FreeBSD 10 years ago. I honestly have no idea why this was done. It was possible that I thought that it might have been better to not set B_ASYNC for the "else" case, but I can't remember. Anyhow, this patch gets rid of the if/else that does the same thing either way, since it looks silly and upsets a static analyser. This will have no semantic effect on the NFS client. PR: 238167
|
#
65417f5e |
|
24-May-2019 |
Alan Somers <asomers@FreeBSD.org> |
Remove "struct ucred*" argument from vtruncbuf vtruncbuf takes a "struct ucred*" argument. AFAICT, it's been unused ever since that function was first added in r34611. Remove it. Also, remove some "struct ucred" arguments from fuse and nfs functions that were only used by vtruncbuf. Reviewed by: cem MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D20377
|
#
391918a3 |
|
06-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not flush NFS node from NFS VOP_SET_TEXT(). The more appropriate place to do the flushing is VOP_OPEN(). This was uncovered because VOP_SET_TEXT() is now called with the vnode' vm_object rlocked, which is incompatible with the flush operations. After the move, there is no need for NFS-specific VOP_SET_TEXT overload. Sponsored by: The FreeBSD Foundation MFC after: 30 days
|
#
78022527 |
|
05-May-2019 |
Konstantin Belousov <kib@FreeBSD.org> |
Switch to use shared vnode locks for text files during image activation. kern_execve() locks text vnode exclusive to be able to set and clear VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0 condition. The change removes VV_TEXT, replacing it with the condition v_writecount <= -1, and puts v_writecount under the vnode interlock. Each text reference decrements v_writecount. To clear the text reference when the segment is unmapped, it is recorded in the vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and v_writecount is incremented on the map entry removal The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that v_writecount does not contradict the desired change. vn_writecheck() is now racy and its use was eliminated everywhere except access. Atomic check for writeability and increment of v_writecount is performed by the VOP. vn_truncate() now increments v_writecount around VOP_SETATTR() call, lack of which is arguably a bug on its own. nullfs bypasses v_writecount to the lower vnode always, so nullfs vnode has its own v_writecount correct, and lower vnode gets all references, since object->handle is always lower vnode. On the text vnode' vm object dealloc, the v_writecount value is reset to zero, and deadfs vop_unset_text short-circuit the operation. Reclamation of lowervp always reclaims all nullfs vnodes referencing lowervp first, so no stray references are left. Reviewed by: markj, trasz Tested by: mjg, pho Sponsored by: The FreeBSD Foundation MFC after: 1 month Differential revision: https://reviews.freebsd.org/D19923
|
#
f5fdf82d |
|
11-Mar-2019 |
Simon J. Gerraty <sjg@FreeBSD.org> |
Add _PC_ACL_* to vop_stdpathconf This avoid EINVAL from tmpfs etc. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D19512
|
#
6ad8a6ea |
|
06-Nov-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change nfs_advlock() so that the NFSVOPUNLOCK() is mostly done at the end. Prior to this patch, nfs_advlock() did NFSVOPUNLOCK(); return (error); in many places. This patch replaces these code sequenences with a "goto out;" and does the NFSVOPUNLOCK(); return (error); at the end of the function in order to make the vnode locking simpler. This patch does not change the semantics of nfs_advlock(). Suggested by: kib Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D17853
|
#
881a9516 |
|
01-Nov-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix NFS client vnode locking to avoid a crash during forced dismount. A crash was reported where the crash occurred in nfs_advlock() when the NFS_ISV4(vp) macro was being executed. This was caused by the vnode being VI_DOOMED due to a forced dismount in progress. This patch fixes the problem by locking the vnode before executing the NFS_ISV4() macro. Tested by: rlibby PR: 232673 Reviewed by: kib MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D17757
|
#
8ff7fad1 |
|
23-Oct-2018 |
Konstantin Belousov <kib@FreeBSD.org> |
Only call sigdeferstop() for NFS. Use bypass to catch any NFS VOP dispatch and route it through the wrapper which does sigdeferstop() and then dispatches original VOP. NFS does not need a bypass below it, which is not supported. The vop offset in the vop_vector is added since otherwise it is impossible to get vop_op_t from the internal table, and I did not wanted to create the layered fs only to wrap NFS VOPs. VFS_OP()s wrap is straightforward. Requested and reviewed by: mjg (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D17658
|
#
222daa42 |
|
25-Jan-2018 |
Conrad Meyer <cem@FreeBSD.org> |
style: Remove remaining deprecated MALLOC/FREE macros Mechanically replace uses of MALLOC/FREE with appropriate invocations of malloc(9) / free(9) (a series of sed expressions). Something like: * MALLOC(a, b, ... -> a = malloc(... * FREE( -> free( * free((caddr_t) -> free( No functional change. For now, punt on modifying contrib ipfilter code, leaving a definition of the macro in its KMALLOC(). Reported by: jhb Reviewed by: cy, imp, markj, rmacklem Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D14035
|
#
d821d364 |
|
21-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Unsign some values related to allocation. When allocating memory through malloc(9), we always expect the amount of memory requested to be unsigned as a negative value would either stand for an error or an overflow. Unsign some values, found when considering the use of mallocarray(9), to avoid unnecessary casting. Also consider that indexes should be of at least the same size/type as the upper limit they pretend to index. MFC after: 3 weeks
|
#
ac2fffa4 |
|
21-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197
|
#
d48d1a64 |
|
15-Jan-2018 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
nfsclient: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these ire likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the sorrounding code could handle NULL values. X-Differential revision: https://reviews.freebsd.org/D13837
|
#
95c8838c |
|
21-Dec-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix build for LP64 arches with gcc. gcc complaints that the comparision is always false due to the value range, and the cast does not prevent the analysis. Split the LP64 vs. ILP32 clamping as a workaround. Sponsored by: The FreeBSD Foundation
|
#
b501cc5d |
|
19-Dec-2017 |
John Baldwin <jhb@FreeBSD.org> |
Rework pathconf handling for FIFOs. On the one hand, FIFOs should respect other variables not supported by the fifofs vnode operation (such as _PC_NAME_MAX, _PC_LINK_MAX, etc.). These values are fs-specific and must come from a fs-specific method. On the other hand, filesystems that support FIFOs are required to support _PC_PIPE_BUF on directory vnodes that can contain FIFOs. Given this latter requirement, once the fs-specific VOP_PATHCONF method supports _PC_PIPE_BUF for directories, it is also suitable for FIFOs permitting a single VOP_PATHCONF method to be used for both FIFOs and non-FIFOs. To that end, retire all of the FIFO-specific pathconf methods from filesystems and change FIFO-specific vnode operation switches to use the existing fs-specific VOP_PATHCONF method. For fifofs, set it's VOP_PATHCONF to VOP_PANIC since it should no longer be used. While here, move _PC_PIPE_BUF handling out of vop_stdpathconf() so that only filesystems supporting FIFOs will report a value. In addition, only report a valid _PC_PIPE_BUF for directories and FIFOs. Discussed with: bde Reviewed by: kib (part of a larger patch) MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D12572
|
#
a0a073b1 |
|
19-Dec-2017 |
John Baldwin <jhb@FreeBSD.org> |
Update NFS to handle larger link counts post ino64. - Define a NFS_LINK_MAX as UINT32_MAX to match the wire protocol. - Use NFS_LINK_MAX instead of LINK_MAX as the fallback value reported for a PATHCONF RPC by the NFS server. - Use NFS_LINK_MAX instead of LINK_MAX as the default value reported by the NFS client pathconf() if not overridden by the NFS server. - When reading the link count out of an RPC reply, read the full 32 bits instead of the lower 16 bits. Reviewed by: rmacklem (earlier version) Sponsored by: Chelsio Communications
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
e5cffdd3 |
|
20-Aug-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not drop NFS vnode lock when performing consistency checks. Currently several paths in the NFS client upgrade the shared vnode lock to exclusive, which might cause temporal dropping of the lock. This action appears to be fatal for nullfs mounts over NFS. If the operation is performed over nullfs vnode, then bypassed down to NFS VOP, and the lock is dropped, other thread might reclaim the upper nullfs vnode. Since on reclaim the nullfs vnode lock and NFS vnode lock are split, the original lock state of the nullfs vnode is not restored. As result, VFS operations receive not locked vnode after a VOP call. Stop upgrading the vnode lock when we check the consistency or flush buffers as result of detected inconsistency. Instead, allocate a new lockmgr lock for each NFS node, which is locked exclusive instead of the vnode lock upgrade. In other words, the other parallel modification of the vnode are excluded by either vnode lock conflict or exclusivity of the new lock when the vnode lock is shared. Also revert r316529 because now the vnode cannot be reclaimed during ncl_vinvalbuf(). In collaboration with: pho Reviewed by: rmacklem Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D12083
|
#
16f300fa |
|
27-Jul-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace the checks for MNTK_UNMOUNTF with a macro that does the same thing. This patch defines a macro that checks for MNTK_UNMOUNTF and replaces explicit checks with this macro. It has no effect on semantics, but prepares the code for a future patch where there will also be a NFS specific flag for "forced dismount about to occur". Suggested by: kib MFC after: 2 weeks
|
#
15a88f81 |
|
11-Jul-2017 |
John Baldwin <jhb@FreeBSD.org> |
Consistently use vop_stdpathconf() for default pathconf values. Update filesystems not currently using vop_stdpathconf() in pathconf VOPs to use vop_stdpathconf() for any configuration variables that do not have filesystem-specific values. vop_stdpathconf() is used for variables that have system-wide settings as well as providing default values for some values based on system limits. Filesystems can still explicitly override individual settings. PR: 219851 Reported by: cem Reviewed by: cem, kib, ngie MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D11541
|
#
81b07aac |
|
25-Jun-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support to the NFSv4.1/pNFS client for commits through the DS. A NFSv4.1/pNFS server using File Layout can specify that Commit operations are to be done against the DS instead of MDS. Since no extant pNFS server did this, the code was untested and "#ifdef notyet". The FreeBSD pNFS server I am developing does specify that Commits be done through the DS, so the code has been enabled/tested. This patch should only affect the case of a pNFS server that specfies Commits through the DS. PR: 219551 MFC after: 2 weeks
|
#
69921123 |
|
23-May-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Commit the 64-bit inode project. Extend the ino_t, dev_t, nlink_t types to 64-bit ints. Modify struct dirent layout to add d_off, increase the size of d_fileno to 64-bits, increase the size of d_namlen to 16-bits, and change the required alignment. Increase struct statfs f_mntfromname[] and f_mntonname[] array length MNAMELEN to 1024. ABI breakage is mitigated by providing compatibility using versioned symbols, ingenious use of the existing padding in structures, and by employing other tricks. Unfortunately, not everything can be fixed, especially outside the base system. For instance, third-party APIs which pass struct stat around are broken in backward and forward incompatible ways. Kinfo sysctl MIBs ABI is changed in backward-compatible way, but there is no general mechanism to handle other sysctl MIBS which return structures where the layout has changed. It was considered that the breakage is either in the management interfaces, where we usually allow ABI slip, or is not important. Struct xvnode changed layout, no compat shims are provided. For struct xtty, dev_t tty device member was reduced to uint32_t. It was decided that keeping ABI compat in this case is more useful than reporting 64-bit dev_t, for the sake of pstat. Update note: strictly follow the instructions in UPDATING. Build and install the new kernel with COMPAT_FREEBSD11 option enabled, then reboot, and only then install new world. Credits: The 64-bit inode project, also known as ino64, started life many years ago as a project by Gleb Kurtsou (gleb). Kirk McKusick (mckusick) then picked up and updated the patch, and acted as a flag-waver. Feedback, suggestions, and discussions were carried by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles), and Rick Macklem (rmacklem). Kris Moore (kris) performed an initial ports investigation followed by an exp-run by Antoine Brodin (antoine). Essential and all-embracing testing was done by Peter Holm (pho). The heavy lifting of coordinating all these efforts and bringing the project to completion were done by Konstantin Belousov (kib). Sponsored by: The FreeBSD Foundation (emaste, kib) Differential revision: https://reviews.freebsd.org/D10439
|
#
6d4377c1 |
|
14-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Remove unused "cred" argument to ncl_flush(). The "cred" argument of ncl_flush() is unused and it was confusing to have the code passing in NULL for this argument in some cases. This patch deletes this argument. There is no semantic change because of this patch. MFC after: 2 weeks
|
#
0649fcae |
|
12-Apr-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFS client for "text file modified, process killed" mmap'd case. When an mmap'd text file is written and then executed immediately afterwards, it was possible that the modify time would change after the text file was executing, resulting in the process executing the file being killed. This was usually only observed when the file system's times were set to higher resolution, but could have occurred for any time resolution. This was reported on a recent email list discussion. This patch adds a VOP_SET_TEXT() to the NFS client which flushed all dirty pages to the NFS server and then makes sure that n_mtime is up to date to avoid this from occurring. Thanks go to kib@ and pho@ for their help with developing this patch. Tested by: pho Reviewed by: kib MFC after: 2 weeks
|
#
6627d919 |
|
11-Apr-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove debugging printf. Instead, issue a diagnostic and return appropriate error if ncl_flush() was unable to clean buffer queue after the specified number or retries. Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
a046da7e |
|
05-Apr-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove spl*() calls from the nfsclient code. Style adjustments in the related lines in ncl_writebp(). Reviewed by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
48fe9263 |
|
05-Apr-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Handle possible vnode reclamation after ncl_vinvalbuf() call. ncl_vinvalbuf() might need to upgrade vnode lock, allowing the vnode to be reclaimed by other thread. Handle the situation, indicated by the returned error zero and VI_DOOMED iflag set, converting it into EBADF. Handle all calls, even where the vnode is exclusively locked right now. Reviewed by: markj, rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks X-Differential revision: https://reviews.freebsd.org/D10241
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
1c324569 |
|
06-Jan-2017 |
Konstantin Belousov <kib@FreeBSD.org> |
Use type-independent formats for printing nlink_t and ino_t. Extracted from: ino64 work by gleb, mckusick Discussed with: mckusick Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
753a007f |
|
22-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Use buffer pager for NFS. The pager, due to its construction, implements clustering for the page-ins. In particular, buildworld load demonstrates reduction of the READ RPCs from 39k down to 24k. No change in real or CPU time was observed. Discussed with, and measured by: bde No objections from: rmacklem Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
fc2c3afe |
|
22-Nov-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Minor cleanup, remove unneeded XXX comments and unused re-define. Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
1b819cf2 |
|
12-Aug-2016 |
Rick Macklem <rmacklem@FreeBSD.org> |
Update the nfsstats structure to include the changes needed by the patch in D1626 plus changes so that it includes counts for NFSv4.1 (and the draft of NFSv4.2). Also, make all the counts uint64_t and add a vers field at the beginning, so that future revisions can easily be implemented. There is code in place to handle the old vesion of the nfsstats structure for backwards binary compatibility. Subsequent commits will update nfsstat(8) to use the new fields. Submitted by: will (earlier version) Reviewed by: ken MFC after: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D1626
|
#
ad600ac8 |
|
03-Aug-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Remove ncl_printf(), use printf(9) directly. After r303710 the function duplicates printf(). Correct function names in the messages [*]. Noted by: bde [*] Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
e37dfd3d |
|
19-Jun-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Do not access NFS data for reclaimed vnode. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week Approved by: re (delphij)
|
#
b6a60ae7 |
|
11-May-2016 |
Konstantin Belousov <kib@FreeBSD.org> |
Use vfs_hash_ref(9) to eliminate LK_EXCLOTHER kludge. As a consequence, the nfs client override of VOP_LOCK1() is no longer needed. Reviewed and tested by: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
a96c9b30 |
|
29-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
NFS: spelling fixes on comments. No funcional change.
|
#
74b8d63d |
|
10-Apr-2016 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.
|
#
86b9457f |
|
21-May-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
The NFS client wasn't handling getdirentries(2) requests for sizes that are not an exact multiple of DIRBLKSIZ correctly. Fortunately readdir(3) always uses an exact multiple of DIRBLKSIZ, so few applications were affected. This patch fixes this problem by reducing the size of the directory read to an exact multiple of DIRBLKSIZ. Tested by: trasz Reported by: trasz Reviewed by: trasz MFC after: 2 weeks
|
#
2f88b3d2 |
|
25-Dec-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
Delete some duplicate code that was harmless because exactly the same code is at the end of the nfscl_checksattr() function that is called just before it. As such, this code had already been executed and didn't do anything. MFC after: 1 week
|
#
6c21f6ed |
|
18-Dec-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
The VOP_LOOKUP() implementations for CREATE op do not put the name into namecache, to avoid cache trashing when doing large operations. E.g., tar archive extraction is not usually followed by access to many of the files created. Right now, each VOP_LOOKUP() implementation explicitely knowns about this quirk and tests for both MAKEENTRY flag presence and op != CREATE to make the call to cache_enter(). Centralize the handling of the quirk into VFS, by deciding to cache only by MAKEENTRY flag in VOP. VFS now sets NOCACHE flag for CREATE namei() calls. Note that the change in semantic is backward-compatible and could be merged to the stable branch, and is compatible with non-changed third-party filesystems which correctly handle MAKEENTRY. Suggested by: Chris Torek <torek@pi-coral.com> Reviewed by: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
65589a29 |
|
16-Jul-2014 |
Konstantin Belousov <kib@FreeBSD.org> |
Check for the cross-device cross-link attempt in the VFS, instead of forcing filesystem VOP_LINK() methods to repeat the code. In tmpfs_link(), remove redundand check for the type of the source, already done by VFS. Note that NFS server already performs this check before calling VOP_LINK(). Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks
|
#
c0990eda |
|
23-Apr-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFSv4 client's Pathconf RPC (actually a Getattr Op.) so that it only does the RPC for names that are answered by the RPC. Doing the RPC for other names is harmless, but unnecessary. MFC after: 2 weeks
|
#
c7b560b9 |
|
21-Apr-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
For an NFSv4 mount with the "nocto" option, don't get the up to date file attributes upon close. This reduces the Getattr RPC count by about 65% for software builds. MFC after: 2 weeks
|
#
cf766161 |
|
07-Dec-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
For software builds, the NFS client does many small synchronous (with FILE_SYNC) writes because non-contiguous byte ranges in the same buffer cache block are being written. This patch adds a new mount option "noncontigwr" which allows the non-contiguous byte ranges to be combined, with the dirty byte range becoming the superset of the bytes that are dirty, if the file has not been file locked. This reduces the number of writes significantly for software builds. The only case where this change might break existing applications is where an application is writing non-overlapping byte ranges within the same buffer cache block of a file from multiple clients concurrently. Since such an application would normally do file locking on the file, avoiding the byte range merge for files that have been file locked should be sufficient for most (maybe all?) cases. Submitted by: jhb (earlier version) Reviewed by: kib MFC after: 3 weeks
|
#
54366c0b |
|
25-Nov-2013 |
Attilio Rao <attilio@FreeBSD.org> |
- For kernel compiled only with KDTRACE_HOOKS and not any lock debugging option, unbreak the lock tracing release semantic by embedding calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined version of the releasing functions for mutex, rwlock and sxlock. Failing to do so skips the lockstat_probe_func invokation for unlocking. - As part of the LOCKSTAT support is inlined in mutex operation, for kernel compiled without lock debugging options, potentially every consumer must be compiled including opt_kdtrace.h. Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES is linked there and it is only used as a compile-time stub [0]. [0] immediately shows some new bug as DTRACE-derived support for debug in sfxge is broken and it was never really tested. As it was not including correctly opt_kdtrace.h before it was never enabled so it was kept broken for a while. Fix this by using a protection stub, leaving sfxge driver authors the responsibility for fixing it appropriately [1]. Sponsored by: EMC / Isilon storage division Discussed with: rstone [0] Reported by: rstone [1] Discussed with: philip
|
#
56239558 |
|
21-Jun-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
When the NFSv4.1 client is writing to a pNFS Data Server (DS), the file's size attribute does not get updated. As such, it is necessary to invalidate the attribute cache before clearing NMODIFIED for pNFS. MFC after: 2 weeks
|
#
22a72260 |
|
30-May-2013 |
Jeff Roberson <jeff@FreeBSD.org> |
- Convert the bufobj lock to rwlock. - Use a shared bufobj lock in getblk() and inmem(). - Convert softdep's lk to rwlock to match the bufobj lock. - Move INFREECNT to b_flags and protect it with the buf lock. - Remove unnecessary locking around bremfree() and BKGRDINPROG. Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf
|
#
77a03c14 |
|
12-May-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for the eofflag to nfs_readdir() in the new NFS client so that it works under a unionfs mount. Submitted by: Jared Yanovich (slovichon@gmail.com) Reviewed by: kib MFC after: 2 weeks
|
#
3b14c753 |
|
13-Mar-2013 |
John Baldwin <jhb@FreeBSD.org> |
Revert 195703 and 195821 as this special stop handling in NFS is now implemented via VFCF_SBDRY rather than passing PBDRY to individual sleep calls.
|
#
89f6b863 |
|
08-Mar-2013 |
Attilio Rao <attilio@FreeBSD.org> |
Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho
|
#
d177f14d |
|
18-Jan-2013 |
John Baldwin <jhb@FreeBSD.org> |
Use vfs_timestamp() to set file timestamps rather than invoking getmicrotime() or getnanotime() directly in NFS. Reviewed by: rmacklem, bde MFC after: 1 week
|
#
1f60bfd8 |
|
08-Dec-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detecting during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.
|
#
7af1242a |
|
11-May-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
PR# 165923 reported intermittent write failures for dirty memory mapped pages being written back on an NFS mount. Since any thread can call VOP_PUTPAGES() to write back a dirty page, the credentials of that thread may not have write access to the file on an NFS server. (Often the uid is 0, which may be mapped to "nobody" in the NFS server.) Although there is no completely correct fix for this (NFS servers check access on every write RPC instead of at open/mmap time), this patch avoids the common cases by holding onto a credential that recently opened the file for writing and uses that credential for the write RPCs being done by VOP_PUTPAGES() for both NFS clients. Tested by: Joel Ray Holveck (joelh at juniper.net) PR: kern/165923 Reviewed by: kib MFC after: 2 weeks
|
#
4964d807 |
|
27-Apr-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
It was reported via email that some non-FreeBSD NFS servers do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server. Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks
|
#
a53373fa |
|
17-Mar-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client behaviour on error from write RPC back to behaviour of old nfs client. When set to not zero, the pages for which write failed are kept dirty. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks
|
#
5e99212d |
|
02-Mar-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks
|
#
526d0bd5 |
|
20-Feb-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month
|
#
bf40d24a |
|
06-Feb-2012 |
John Baldwin <jhb@FreeBSD.org> |
Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().
|
#
c480f781 |
|
06-Feb-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks
|
#
d5210589 |
|
25-Jan-2012 |
Konstantin Belousov <kib@FreeBSD.org> |
Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks
|
#
0b17c7be |
|
25-Jan-2012 |
John Baldwin <jhb@FreeBSD.org> |
Add a timeout on positive name cache entries in the NFS client. That is, we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely. Reviewed by: rmacklem MFC after: 1 week
|
#
5aefb4cb |
|
20-Jan-2012 |
John Baldwin <jhb@FreeBSD.org> |
Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks
|
#
a5e583ee |
|
15-Nov-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the new NFS client so that nfs_fsync() only calls ncl_flush() for regular files. Since other file types don't write into the buffer cache, calling ncl_flush() is almost a no-op. However, it does clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for directories. MFC after: 2 weeks
|
#
670bf6f1 |
|
13-Nov-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Since NFSv4 byte range locking only works for regular files, add a sanity check for the vnode type to the NFSv4 client. MFC after: 2 weeks
|
#
d1907de2 |
|
30-Jul-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
The new NFS client failed to vput() the new vnode if a setattr failed after the file was created in nfs_create(). This would probably only happen during a forced dismount. The old NFS client does have a vput() for this case. Detected by pho during recent testing, where an open syscall returned with a vnode still locked. Tested by: pho Approved by: re (kib) MFC after: 2 weeks
|
#
68347a92 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Simple find/replace of VOP_ISLOCKED -> NFSVOPISLOCKED. This is done so that NFSVOPISLOCKED can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
a9989634 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that NFSVOPUNLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
98f234f3 |
|
16-Jul-2011 |
Zack Kirsch <zack@FreeBSD.org> |
Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that NFSVOPLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks
|
#
8f0e65c9 |
|
18-Jun-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add DTrace support to the new NFS client. This is essentially cloned from the old NFS client, plus additions for NFSv4. A review of this code is in progress, however it was felt by the reviewer that it could go in now, before code slush. Any changes required by the review can be committed as bug fixes later.
|
#
fb35711d |
|
05-Jun-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for flock(2) locks to the new NFSv4 client. I think this should be ok, since the client now delays NFSv4 Close operations until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no risk that the NFSv4 Open is closed while an associated byte range lock still exists. Tested by: avg MFC after: 2 weeks
|
#
f8f4e256 |
|
05-Jun-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
The new NFSv4 client was erroneously using "p" instead of "p_leader" for the "id" for POSIX byte range locking. I think this would only have affected processes created by rfork(2) with the RFTHREAD flag specified. This patch fixes that by passing the "id" down through the various functions from nfs_advlock(). MFC after: 2 weeks
|
#
b398d106 |
|
31-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the new NFS client so that it doesn't do an NFSv3 Pathconf RPC for cases where the reply doesn't include the answer. This fixes a problem reported by avg@ where the NFSv3 Pathconf RPC would fail when "ls -l" did an lpathconf(2) for _PC_ACL_NFS4. Tested by: avg MFC after: 2 weeks
|
#
81ddb192 |
|
25-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add some missing mutex locking to the new NFS client. MFC after: 2 weeks
|
#
147206ae |
|
25-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the new NFS client so that it correctly sets the "must_commit" argument for a write RPC when it succeeds for the first one and fails for a subsequent RPC within the same call to the function. This makes it compatible with the old NFS client for this case. MFC after: 2 weeks
|
#
76036f2b |
|
22-May-2011 |
Alan Cox <alc@FreeBSD.org> |
Eliminate duplicate #include's.
|
#
1f376590 |
|
15-May-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the sysctl naming for the old and new NFS clients to vfs.oldnfs.xxx and vfs.nfs.xxx respectively. This makes the default nfs client use vfs.nfs.xxx after r221124.
|
#
e2f2b370 |
|
04-May-2011 |
Ruslan Ermilov <ru@FreeBSD.org> |
Implemented a mount option "nocto" that disables cache coherency checking at open time. It may improve performance for read-only NFS mounts. Use deliberately. MFC after: 1 week Reviewed by: rmacklem, jhb (earlier version)
|
#
bc62b5cf |
|
17-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a vput() to nfs_lookitup() in the experimental NFS client for a case that will probably never happen. It can only happen if a server were to successfully lookup a file, but not return attributes for that file. Although technically allowed by the NFSv3 RFC, I doubt any server would ever do this. However, if it did, the client would have not vput()'d the new vnode when it needed to do so. MFC after: 2 weeks
|
#
ab42af27 |
|
17-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add vput() calls in two places in the experimental NFS client that would be needed if, in the future, nfscl_loadattrcache() were to return an error. Currently nfscl_loadattrcache() never returns an error, so these cases never currently happen. MFC after: 2 weeks
|
#
78d8a600 |
|
17-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the mutex locking for several locations in the experimental NFS client's vnode op functions to make them compatible with the regular NFS client. I'll admit I'm not sure that the mutex locks around the assignments are needed, but the regular client has them, so I added them. Also, add handling of the case of partial attributes in setattr to be compatible with the regular client. MFC after: 2 weeks
|
#
0a9f005d |
|
17-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix up some of the sysctls for the experimental NFS client so that they use the same names as the regular client. Also add string descriptions for them. MFC after: 2 weeks
|
#
4b3a38ec |
|
16-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a lktype flags argument to nfscl_nget() and ncl_nget() in the experimental NFS client so that its nfs_lookup() function can use cn_lkflags in a manner analagous to the regular NFS client. MFC after: 2 weeks
|
#
149ce102 |
|
13-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add VOP_PATHCONF() support to the experimental NFS client so that it can, along with other things, report whether or not NFS4 ACLs are supported. MFC after: 2 weeks
|
#
f93d95cb |
|
29-Oct-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify nfs_open() in the experimental NFS client to be compatible with the regular NFS client. Also, fix a couple of mutex lock issues. MFC after: 1 week
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
ca27c028 |
|
18-Oct-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFS clients and the NLM so that the NLM can be used by both clients. Since the NLM uses various fields of the nfsmount structure, those fields were extracted and put in a separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h. This structure also has a function pointer for a function that extracts the required information from the mount point and nfs vnode for that particular client, for information stored differently by the clients. Reviewed by: jhb MFC after: 2 weeks
|
#
a8c0af59 |
|
09-Sep-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental NFS client so that it doesn't panic when NFSv2,3 byte range locking is attempted. A fix that allows the nlm_advlock() to work with both clients is in progress, but may take a while. As such, I am doing this commit so that the kernel doesn't panic in the meantime. Submitted by: jh MFC after: 2 weeks
|
#
8e27c182 |
|
07-Sep-2010 |
John Baldwin <jhb@FreeBSD.org> |
Store the full timestamp when caching timestamps of files and directories for purposes of validating name cache entries. This closes races where two updates to a file or directory within the same second could result in stale entries in the name cache. While here, remove the 'n_expiry' field as it is no longer used. Reviewed by: rmacklem MFC after: 1 week
|
#
0372f5f4 |
|
04-Sep-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Disable use of the NLM in the experimental NFS client, since it will crash the kernel because it uses the nfsmount and nfsnode structures of the regular NFS client. MFC after: 2 weeks
|
#
e3649d5a |
|
02-Aug-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the return value for nfscl_mustflush() from boolean_t, which I mistakenly thought was correct w.r.t. style(9), back to int and add the checks for != 0. This is just a stylistic modification. MFC after: 1 week
|
#
f92bbff2 |
|
24-Jul-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move sys/nfsclient/nfs_lock.c into sys/nfs and build it as a separate module that can be used by both the regular and experimental nfs clients. This fixes the problem reported by jh@ where /dev/nfslock would be registered twice when both nfs clients were used. I also defined the size of the lm_fh field to be the correct value, as it should be the maximum size of an NFSv3 file handle. Reviewed by: jh MFC after: 2 weeks
|
#
6ec1ef63 |
|
18-Jul-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a call to nfscl_mustflush() in nfs_close() of the experimental NFSv4 client, so that attributes are not acquired from the server when a delegation for the file is held. This can reduce the number of Getattr Ops significantly. MFC after: 2 weeks
|
#
f9b1a4a3 |
|
15-Jul-2010 |
John Baldwin <jhb@FreeBSD.org> |
Merge 208603, 209946, and 209948 to the new NFS client: Move attribute cache flushes from VOP_OPEN() to VOP_LOOKUP() to provide more graceful recovery for stale filehandles and eliminate the need for conditionally clearing the attribute cache in the !NMODIFIED case in VOP_OPEN(). Reviewed by: rmacklem MFC after: 2 weeks
|
#
b38f7723 |
|
12-Jun-2010 |
Konstantin Belousov <kib@FreeBSD.org> |
In NFS clients, instead of inconsistently using #ifdef DIAGNOSTIC and #ifndef DIAGNOSTIC for debug assertions, prefer KASSERT(). Also change one #ifdef DIAGNOSTIC in the new nfs server. Submitted by: Mikolaj Golub <to.my.trociny gmail com> MFC after: 2 weeks
|
#
227b9ebe |
|
30-Apr-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
MFC: r207170 An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs during the grace period after startup. This grace period must be at least the lease duration, which is typically 1-2 minutes. It seems prudent for the experimental NFS client to wait a few seconds before retrying such an RPC, so that the server isn't flooded with non-recovery RPCs during recovery. This patch adds an argument to nfs_catnap() to implement a 5 second delay for this case.
|
#
52d8bf89 |
|
29-Apr-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
MFC: r207082 When the experimental NFS client is handling an NFSv4 server reboot with delegations enabled, the recovery could fail if the renew thread is trying to return a delegation, since it will not do the recovery. This patch fixes the above by having nfscl_recalldeleg() fail with the I/O operations returning EIO, so that they will be attempted later. Most of the patch consists of adding an argument to various functions to indicate the delegation recall case where this needs to be done.
|
#
23f929df |
|
24-Apr-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
An NFSv4 server will reply NFSERR_GRACE for non-recovery RPCs during the grace period after startup. This grace period must be at least the lease duration, which is typically 1-2 minutes. It seems prudent for the experimental NFS client to wait a few seconds before retrying such an RPC, so that the server isn't flooded with non-recovery RPCs during recovery. This patch adds an argument to nfs_catnap() to implement a 5 second delay for this case. MFC after: 1 week
|
#
67c5c2d2 |
|
22-Apr-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
When the experimental NFS client is handling an NFSv4 server reboot with delegations enabled, the recovery could fail if the renew thread is trying to return a delegation, since it will not do the recovery. This patch fixes the above by having nfscl_recalldeleg() fail with the I/O operations returning EIO, so that they will be attempted later. Most of the patch consists of adding an argument to various functions to indicate the delegation recall case where this needs to be done. MFC after: 1 week
|
#
16a11310 |
|
13-Feb-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
MFC: r203303 Patch the experimental NFS client so that there is a timeout for negative name cache entries in a manner analogous to r202767 for the regular NFS client. Also, make the code in nfs_lookup() compatible with that of the regular client and replace the sysctl variable that enabled negative name caching with the mount point option.
|
#
dd5b5a94 |
|
31-Jan-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Patch the experimental NFS client so that there is a timeout for negative name cache entries in a manner analogous to r202767 for the regular NFS client. Also, make the code in nfs_lookup() compatible with that of the regular client and replace the sysctl variable that enabled negative name caching with the mount point option. MFC after: 2 weeks
|
#
6049e7fc |
|
11-Jan-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
MFC: r201029 When porting the experimental nfs subsystem to the FreeBSD8 krpc, I added 3 functions that were already in the experimental client under different names. This patch deletes the functions in the experimental client and renames the calls to use the other set. (This is just removal of duplicated code and does not fix any bug.)
|
#
4a8e2176 |
|
26-Dec-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
When porting the experimental nfs subsystem to the FreeBSD8 krpc, I added 3 functions that were already in the experimental client under different names. This patch deletes the functions in the experimental client and renames the calls to use the other set. (This is just removal of duplicated code and does not fix any bug.) MFC after: 2 weeks
|
#
74991298 |
|
03-Dec-2009 |
Edward Tomasz Napierala <trasz@FreeBSD.org> |
Remove unneeded ifdefs. Reviewed by: rmacklem
|
#
1bb015c0 |
|
11-Nov-2009 |
Jaakko Heinonen <jh@FreeBSD.org> |
Create verifier used by FreeBSD NFS client is suboptimal because the first part of a verifier is set to the first IP address from V_in_ifaddrhead list. This address is typically the loopback address making the first part of the verifier practically non-unique. The second part of the verifier is initialized to zero making its initial value non-unique too. This commit changes the strategy for create verifier initialization: just initialize it to a random value. Also move verifier handling into its own function and use a mutex to protect the variable. This change is a candidate for porting to sys/nfsclient. Reviewed by: jhb, rmacklem Approved by: trasz (mentor)
|
#
83864c81 |
|
28-Aug-2009 |
Marko Zec <zec@FreeBSD.org> |
MFC r196503: Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet context inside the RPC code. Temporarily set td's cred to mount's cred before calling socreate() via __rpc_nconf2socket(). Submitted by: rmacklem (in part) Reviewed by: rmacklem, rwatson Discussed with: dfr, bz Approved by: re (rwatson), julian (mentor) Approved by: re (rwatson)
|
#
0348c661 |
|
24-Aug-2009 |
Marko Zec <zec@FreeBSD.org> |
Fix NFS panics with options VIMAGE kernels by apropriately setting curvnet context inside the RPC code. Temporarily set td's cred to mount's cred before calling socreate() via __rpc_nconf2socket(). Submitted by: rmacklem (in part) Reviewed by: rmacklem, rwatson Discussed with: dfr, bz Approved by: re (rwatson), julian (mentor) MFC after: 3 days
|
#
6a795918 |
|
29-Jul-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental nfs client so that it only calls ncl_vinvalbuf() for NFSv2 and not NFSv4 when nfscl_mustflush() returns 0. Since nfscl_mustflush() only returns 0 when there is a valid write delegation issued to the client, it only affects the case of an NFSv4 mount with callbacks/delegations enabled. Approved by: re (kensmith), kib (mentor)
|
#
c79e6976 |
|
22-Jul-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add changes to the experimental nfs client to use the PBDRY flag for msleep(9) when a vnode lock or similar may be held. The changes are just a clone of the changes applied to the regular nfs client by r195703. Approved by: re (kensmith), kib (mentor)
|
#
405229f9 |
|
14-Jul-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the experimental nfs client so that it does not cause a "share->excl" panic when doing a lookup of dotdot at the root of a server's file system. The patch avoids calling vn_lock() for that case, since nfscl_nget() has already acquired a lock for the vnode. Approved by: re (kensmith), kib (mentor)
|
#
eddfbb76 |
|
14-Jul-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Build on Jeff Roberson's linker-set based dynamic per-CPU allocator (DPCPU), as suggested by Peter Wemm, and implement a new per-virtual network stack memory allocator. Modify vnet to use the allocator instead of monolithic global container structures (vinet, ...). This change solves many binary compatibility problems associated with VIMAGE, and restores ELF symbols for virtualized global variables. Each virtualized global variable exists as a "reference copy", and also once per virtual network stack. Virtualized global variables are tagged at compile-time, placing the in a special linker set, which is loaded into a contiguous region of kernel memory. Virtualized global variables in the base kernel are linked as normal, but those in modules are copied and relocated to a reserved portion of the kernel's vnet region with the help of a the kernel linker. Virtualized global variables exist in per-vnet memory set up when the network stack instance is created, and are initialized statically from the reference copy. Run-time access occurs via an accessor macro, which converts from the current vnet and requested symbol to a per-vnet address. When "options VIMAGE" is not compiled into the kernel, normal global ELF symbols will be used instead and indirection is avoided. This change restores static initialization for network stack global variables, restores support for non-global symbols and types, eliminates the need for many subsystem constructors, eliminates large per-subsystem structures that caused many binary compatibility issues both for monitoring applications (netstat) and kernel modules, removes the per-function INIT_VNET_*() macros throughout the stack, eliminates the need for vnet_symmap ksym(2) munging, and eliminates duplicate definitions of virtualized globals under VIMAGE_GLOBALS. Bump __FreeBSD_version and update UPDATING. Portions submitted by: bz Reviewed by: bz, zec Discussed with: gnn, jamie, jeff, jhb, julian, sam Suggested by: peter Approved by: re (kensmith)
|
#
ad86aef9 |
|
12-Jul-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the handling of dotdot in lookup for the experimental nfs client in a manner analagous to the change in r195294 for the regular nfs client. Approved by: re (kensmith), kib (mentor)
|
#
2d9cfaba |
|
25-Jun-2009 |
Robert Watson <rwatson@FreeBSD.org> |
Add a new global rwlock, in_ifaddr_lock, which will synchronize use of the in_ifaddrhead and INADDR_HASH address lists. Previously, these lists were used unsynchronized as they were effectively never changed in steady state, but we've seen increasing reports of writer-writer races on very busy VPN servers as core count has gone up (and similar configurations where address lists change frequently and concurrently). For the time being, use rwlocks rather than rmlocks in order to take advantage of their better lock debugging support. As a result, we don't enable ip_input()'s read-locking of INADDR_HASH until an rmlock conversion is complete and a performance analysis has been done. This means that one class of reader-writer races still exists. MFC after: 6 weeks Reviewed by: bz
|
#
ebd8672c |
|
17-Jun-2009 |
Bjoern A. Zeeb <bz@FreeBSD.org> |
Add explicit includes for jail.h to the files that need them and remove the "hidden" one from vimage.h.
|
#
81e3c4fc |
|
17-Jun-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix handling of ".." in nfs_lookup() for the forced dismount case by cribbing the change made to the regular nfs client in r194358. Approved by: kib (mentor)
|
#
410654ec |
|
09-Jun-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Since vn_lock() with the LK_RETRY flag never returns an error for FreeBSD-CURRENT, the code that checked for and returned the error was broken. Change it to check for VI_DOOMED set after vn_lock() and return an error for that case. I believe this should only happen for forced dismounts. Approved by: kib (mentor)
|
#
705fe7ce |
|
31-May-2009 |
Marko Zec <zec@FreeBSD.org> |
Unbreak options VIMAGE kernel builds. Approved by: julian (mentor)
|
#
6ffb637d |
|
22-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the code in the experimental nfs client to avoid flushing writes upon close when a write delegation is held by the client. This should be safe to do, now that nfsv4 Close operations are delayed until ncl_inactive() is called for the vnode. Approved by: kib (mentor)
|
#
47a59856 |
|
18-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the experimental NFSv4 client so that it does not do the NFSv4 Close operations until ncl_inactive(). This is necessary so that the Open StateIDs are available for doing I/O on mmap'd files after VOP_CLOSE(). I also changed some indentation for the nfscl_getclose() function. Approved by: kib (mentor)
|
#
9ec7b004 |
|
04-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add the experimental nfs subtree to the kernel, that includes support for NFSv4 as well as NFSv2 and 3. It lives in 3 subdirs under sys/fs: nfs - functions that are common to the client and server nfsclient - a mutation of sys/nfsclient that call generic functions to do RPCs and handle state. As such, it retains the buffer cache handling characteristics and vnode semantics that are found in sys/nfsclient, for the most part. nfsserver - the server. It includes a DRC designed specifically for NFSv4, that is used instead of the generic DRC in sys/rpc. The build glue will be checked in later, so at this point, it consists of 3 new subdirs that should not affect kernel building. Approved by: kib (mentor)
|