#
dd7d42a1 |
|
23-Oct-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl/kgssapi: Fix Kerberized NFS mounts to pNFS servers During recent testing related to the IETF NFSv4 Bakeathon, it was discovered that Kerberized NFSv4.1/4.2 mounts to pNFS servers (sec=krb5[ip],pnfs mount options) was broken. The FreeBSD client was using the "service principal" for the MDS to try and establish a rpcsec_gss credential for a DS, which is incorrect. (A "service principal" looks like "nfs@<fqdn-of-server>" and the <fqdn-of-server> for the DS is not the same as the MDS for most pNFS servers.) To fix this, the rpcsec_gss code needs to be able to do a reverse DNS lookup of the DS's IP address. A new kgssapi upcall to the gssd(8) daemon is added by this patch to do the reverse DNS along with a new rpcsec_gss function to generate the "service principal". A separate patch to the gssd(8) will be committed, so that this patch will fix the problem. Without the gssd(8) patch, the new upcall fails and current/incorrect behaviour remains. This bug only affects the rare case of a Kerberized (sec=krb5[ip],pnfs) mount using pNFS. This patch changes the internal KAPI between the kgssapi and nfscl modules, but since I did a version bump a few days ago, I will not do one this time. MFC after: 1 month
|
#
95ee2897 |
|
16-Aug-2023 |
Warner Losh <imp@FreeBSD.org> |
sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
|
#
f4179ad4 |
|
01-Apr-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscommon: Add support for an NFSv4 operation bitmap NFSv4.1/4.2 uses operation bitmaps for various operations, such as the SP4_MACH_CRED case for ExchangeID. This patch adds support for operation bitmaps so that support for SP4_MACH_CRED can be added to the NFSv4.1/4.2 server in a future commit. This commit should not change any NFSv4.1/4.2 semantics. MFC after: 3 months
|
#
896516e5 |
|
16-Mar-2023 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfscl: Add a new NFSv4.1/4.2 mount option for Kerberized mounts Without this patch, a Kerberized NFSv4.1/4.2 mount must provide a Kerberos credential for the client at mount time. This credential is typically referred to as a "machine credential". It can be created one of two ways: - The user (usually root) has a valid TGT at the time the mount is done and this becomes the machine credential. There are two problems with this. 1 - The user doing the mount must have a valid TGT for a user principal at mount time. As such, the mount cannot be put in fstab(5) or similar. 2 - When the TGT expires, the mount breaks. - The client machine has a service principal in its default keytab file and this service principal (typically called a host-based initiator credential) is used as the machine credential. There are problems with this approach as well: 1 - There is a certain amount of administrative overhead creating the service principal for the NFS client, creating a keytab entry for this principal and then copying the keytab entry into the client's default keytab file via some secure means. 2 - The NFS client must have a fixed, well known, DNS name, since that FQDN is in the service principal name as the instance. This patch uses a feature of NFSv4.1/4.2 called SP4_NONE, which allows the state maintenance operations to be performed by any authentication mechanism, to do these operations via AUTH_SYS instead of RPCSEC_GSS (Kerberos). As such, neither of the above mechanisms is needed. It is hoped that this option will encourage adoption of Kerberized NFS mounts using TLS, to provide a more secure NFS mount. This new NFSv4.1/4.2 mount option, called "syskrb5" must be used with "sec=krb5[ip]" to avoid the need for either of the above Kerberos setups to be done by the client. Note that all file access/modification operations still require users on the NFS client to have a valid TGT recognized by the NFSv4.1/4.2 server. As such, this option allows, at most, a malicious client to do some sort of DOS attack. Although not required, use of "tls" with this new option is encouraged, since it provides on-the-wire encryption plus, optionally, client identity verification via a X.509 certificate provided to the server during TLS handshake. Alternately, "sec=krb5p" does provide on-the-wire encryption of file data. A mount_nfs(8) man page update will be done in a separate commit. Discussed on: freebsd-current@ MFC after: 3 months
|
#
ef4edb70 |
|
04-May-2022 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Add a sanity check for Owner/OwnerGroup string length Robert Morris reported that, if a client sends an absurdly large Owner/OwnerGroup string, the kernel malloc() for the large size string can block forever. This patch adds a sanity limit for Owner/OwnerGroup string length. Since the RFCs do not specify any limit and FreeBSD can handle a group name greater than 1Kbyte, the limit is set at a generous 10Kbytes. Reported by: rtm@lcs.mit.edu PR: 260546 MFC after: 2 weeks
|
#
ee29e6f3 |
|
16-Jul-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Add sysctl to set maximum I/O size up to 1Mbyte Since MAXPHYS now allows the FreeBSD NFS client to do 1Mbyte I/O operations, add a sysctl called vfs.nfsd.srvmaxio so that the maximum NFS server I/O size can be set up to 1Mbyte. The Linux NFS client can also do 1Mbyte I/O operations. The default of 128Kbytes for the maximum I/O size has not been changed for two reasons: - kern.ipc.maxsockbuf must be increased to support 1Mbyte I/O - The limited benchmarking I can do actually shows a drop in I/O rate when the I/O size is above 256Kbytes. However, daveb@spectralogic.com reports seeing an increase in I/O rate for the 1Mbyte I/O size vs 128Kbytes using a Linux client. Reviewed by: asomers MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D30826
|
#
fc0dc940 |
|
18-May-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: Reduce the callback timeout to 800msec Recent discussion on the nfsv4@ietf.org mailing list confirmed that an NFSv4 server should reply to an RPC in less than 1second. If an NFSv4 RPC requires a delegation be recalled, the server will attempt a CB_RECALL callback. If the client is not responsive, the RPC reply will be delayed until the callback times out. Without this patch, the timeout is set to 4 seconds (set in ticks, but used as seconds), resulting in the RPC reply taking over 4sec. This patch redefines the constant as being in milliseconds and it implements that for a value of 800msec, to ensure the RPC reply is sent in less than 1second. This patch only affects mounts from clients when delegations are enabled on the server and the client is unresponsive to callbacks. MFC after: 2 weeks
|
#
dc78533a |
|
01-Jan-2021 |
Rick Macklem <rmacklem@FreeBSD.org> |
nfsd: fix NFSv4.0 seqid handling for ERELOOKUP Commit 774a36851e0e fixed the NFS server so that it could handle ERELOOKUP returns from VOP calls by redoing the operation/RPC. However, for NFSv4.0, redoing an Open would increment the open_owner's seqid multiple times, breaking the protocol. This patch sets a new flag called ND_ERELOOKUP on the RPC when a redo is in progress. Then the code that increments the seqid avoids the seqid increment/check when the flag is set, since it indicates this has already been done for the Open.
|
#
586ee69f |
|
01-Sep-2020 |
Mateusz Guzik <mjg@FreeBSD.org> |
fs: clean up empty lines in .c and .h files
|
#
02511d21 |
|
10-Aug-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add an argument to newnfs_connect() that indicates use TLS for the connection. For NFSv4.0, the server creates a server->client TCP connection for callbacks. If the client mount on the server is using TLS, enable TLS for this callback TCP connection. TLS connections from clients will not be supported until the kernel RPC changes are committed. Since this changes the internal ABI between the NFS kernel modules that will require a version bump, delete newnfs_trimtrailing(), which is no longer used. Since LCL_TLSCB is not yet set, these changes should not have any semantic affect at this time.
|
#
4476c1de |
|
25-Jun-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs should be used. For KERN_TLS (and possibly some other future network interface) the mbuf list passed into sosend() must be ext_pgs mbufs. The krpc could simply copy all the mbuf data into ext_pgs mbufs before calling sosend(), but that would be inefficient for large RPC messages. This patch adds an argument to nfscl_reqstart() to indicate that it should fill the RPC message into ext_pgs mbufs. It also adds fields to "struct nfsrv_descript" needed for building NFS RPC messages in ext_pgs mbufs, along with new flags for this. Since the argument is always "false", this commit should not result in any semantic change. However, this commit prepares the code for future commits that will add support for building of NFS RPC messages in ext_pgs mbufs.
|
#
ae070589 |
|
17-Apr-2020 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace all instances of the typedef mbuf_t with "struct mbuf *". The typedef mbuf_t was used for the Mac OS/X port of the code long ago. Since this port is no longer used and the use of mbuf_t obscures what the code does (and is not consistent with style(9)), it is no longer needed. This patch replaces all instances of mbuf_t with "struct mbuf *", so that it is no longer used. This patch should not result in any semantic change.
|
#
c057a378 |
|
12-Dec-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for NFSv4.2 to the NFS client and server. This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes (RFC-8276) to the NFS client and server. NFSv4.2 is comprised of several optional features that can be supported in addition to NFSv4.1. This patch adds the following optional features: - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED) - posix_fallocate() - intra server file range copying via the copy_file_range(2) syscall --> Avoiding data tranfer over the wire to/from the NFS client. - lseek(SEEK_DATA/SEEK_HOLE) - Extended attribute syscalls for "user" namespace attributes as defined by RFC-8276. Although this patch is fairly large, it should not affect support for the other versions of NFS. However it does add two new sysctls that allow a sysadmin to limit which minor versions of NFSv4 a server supports, allowing a sysadmin to disable NFSv4.2. Unfortunately, when the NFS stats structure was last revised, it was assumed that there would be no additional operations added beyond what was specified in RFC-7862. However RFC-8276 did add additional operations, forcing the NFS stats structure to revised again. It now has extra unused entries in all arrays, so that future extensions to NFSv4.2 can be accomodated without revising this structure again. A future commit will update nfsstat(1) to report counts for the new NFSv4.2 specific operations/procedures. This patch affects the internal interface between the nfscommon, nfscl and nfsd modules and, as such, they all must be upgraded simultaneously. I will do a version bump (although arguably not needed), due to this. This code has survived a "make universe" but has not been built with a recent GCC. If you encounter build problems, please email me. Relnotes: yes
|
#
2096ce03 |
|
06-Dec-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a couple of definitions for NFSv4.2 and update macros to use them. This patch adds code to macros to clear attribute bits not supported by NFSv4.2. For now, these bits are never set anyhow, but this prepares the code for the addition of NFSv4.2 support in a future commit. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys.
|
#
e1cda5ee |
|
28-Nov-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix two races while handling nfsuserd daemon start/stop. A crash was reported where the nr_client field was NULL during an upcall to the nfsuserd daemon. Since nr_client == NULL only occurs when the nfsuserd daemon is being shut down, it appeared to be caused by a race between doing an upcall and the daemon shutting down. By inspection two races were identified: 1 - The nfsrv_nfsuserd variable is used to indicate whether or not the daemon is running. However it did not handle the intermediate phase where the daemon is starting or stopping. This was fixed by making nfsrv_nfsuserd tri-state and having the functions that are called during start/stop to obey the intermediate state. 2 - nfsrv_nfsuserd was checked to see that the daemon was running at the beginning of an upcall, but nothing prevented the daemon from being shut down while an upcall was still in progress. This race probably caused the crash. The patch fixes this by adding a count of upcalls in progress and having the shut down function delay until this count goes to zero before getting rid of nr_client and related data used by an upcall. Tested by: avg (Panzura QA) Reported by: avg Reviewed by: avg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22377
|
#
ea5776ec |
|
18-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFSv4.0 server so that it does not support NFSv4.1 attributes. During inspection of a packet trace, I noticed that an NFSv4.0 mount reported that it supported attributes that are only defined for NFSv4.1. In practice, this bug appears to be benign, since NFSv4.0 clients will not use attributes that were added for NFSv4.1. However, this was not correct and this patch fixes the NFSv4.0 server so that it only supports attributes defined for NFSv4.0. It also adds a definition for NFSv4.1 attributes that can only be set, although it is only defined as 0 for now. This is anticipation of the addition of support for the NFSv4.1 mode+mask attribute soon. MFC after: 2 weeks
|
#
80405bcf |
|
06-Apr-2019 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add INET6 support for the upcalls to the nfsuserd daemon. The kernel code uses UDP to do upcalls to the nfsuserd(8) daemon to get updates to the username<->uid and groupname<->gid mappings. A change to AF_LOCAL last year had to be reverted, since it could result in vnode locking issues on the AF_LOCAL socket. This patch adds INET6 support and the required #ifdef INET and INET6 to the code. Requested by: bz PR: 205193 Reviewed by: bz, rgrimes MFC after: 2 weeks Differential Revision: http://reviews.freebsd.org/D19218
|
#
a3e709cd |
|
28-Jul-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFSv4.1 server so that it allows ReclaimComplete as done by ESXi 6.7. I believe that a ReclaimComplete with rca_one_fs == TRUE is only to be used after a file system has been transferred to a different file server. However, RFC5661 is somewhat vague w.r.t. this and the ESXi 6.7 client does both a ReclaimComplete with rca_one_fs == TRUE and one with ReclaimComplete with rca_one_fs == FALSE. Therefore, just ignore the rca_one_fs == TRUE operation and return NFS_OK without doing anything instead of replying NFS4ERR_NOTSUPP. This allows the ESXi 6.7 NFSv4.1 client to do a mount. After discussion on the NFSv4 IETF working group mailing list, doing this along with setting a flag to note that a ReclaimComplete with rca_one_fs TRUE was an appropriate way to handle this. The flag that indicates that a ReclaimComplete with rca_one_fs == TRUE was done may be used to disable replies of NFS4ERR_GRACE for non-reclaim state operations in a future commit. This patch along with r332790, r334492 and r336357 allow ESXi 6.7 NFSv4.1 mounts work ok. ESX 6.5 NFSv4.1 mounts do not work well, due to what I believe are violations of RFC-5661 and should not be used. Reported by: andreas.nagy@frequentis.com Tested by: andreas.nagy@frequentis.com, daniel@ftml.net (earlier version) MFC after: 2 weeks Relnotes: yes
|
#
de9a1a70 |
|
09-Jul-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for a "forced" pnfsdskill to the pNFS server kernel code. The pnfsdskill(8) command will normally fail if there is no valid mirror for the DS to be disabled. However, a system administrator may need to disable a DS which does not have a valid mirror so that the nfsd threads can be terminated. This patch adds the kernel code needed by pnfsdskill(8) to implement this "forced" case of disabling a DS. This patch only affects the pNFS server.
|
#
2f32675c |
|
02-Jul-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add an optional feature to the pNFS server. Without this patch, the pNFS server distributes the data storage files across all of the specified DSs. A tester noted that it would be nice if a system administrator could control which DSs are used to store the file data for a given exported MDS file system. This patch adds the kernel support to do this. It also makes a slight semantic change to nfsv4_findmirror(), since some uses of it no longer require that the DS being searched for have a current mirror. A patch that will be committed in a few minutes will modify the nfsd daemon to support this feature. The patch should only affect sites using the pNFS server (specified via the "-p" command line option for nfsd. Suggested by: james.rose@framestore.com
|
#
b18130d3 |
|
22-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Define ND_HASSLOTID needed by r335568. r335568 uses a flag called ND_HASSLOTID to indicate that the slotid is set, so it can free and invalidate it. This flag needs to be set, which will be done in a subsequent commit. MFC after: 2 weeks
|
#
90d2dfab |
|
12-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Merge the pNFS server code from projects/pnfs-planb-server into head. This code merge adds a pNFS service to the NFSv4.1 server. Although it is a large commit it should not affect behaviour for a non-pNFS NFS server. Some documentation on how this works can be found at: http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt and will hopefully be turned into a proper document soon. This is a merge of the kernel code. Userland and man page changes will come soon, once the dust settles on this merge. It has passed a "make universe", so I hope it will not cause build problems. It also adds NFSv4.1 server support for the "current stateid". Here is a brief overview of the pNFS service: A pNFS service separates the Read/Write oeprations from all the other NFSv4.1 Metadata operations. It is hoped that this separation allows a pNFS service to be configured that exceeds the limits of a single NFS server for either storage capacity and/or I/O bandwidth. It is possible to configure mirroring within the data servers (DSs) so that the data storage file for an MDS file will be mirrored on two or more of the DSs. When this is used, failure of a DS will not stop the pNFS service and a failed DS can be recovered once repaired while the pNFS service continues to operate. Although two way mirroring would be the norm, it is possible to set a mirroring level of up to four or the number of DSs, whichever is less. The Metadata server will always be a single point of failure, just as a single NFS server is. A Plan B pNFS service consists of a single MetaData Server (MDS) and K Data Servers (DS), all of which are recent FreeBSD systems. Clients will mount the MDS as they would a single NFS server. When files are created, the MDS creates a file tree identical to what a single NFS server creates, except that all the regular (VREG) files will be empty. As such, if you look at the exported tree on the MDS directly on the MDS server (not via an NFS mount), the files will all be of size 0. Each of these files will also have two extended attributes in the system attribute name space: pnfsd.dsfile - This extended attrbute stores the information that the MDS needs to find the data storage file(s) on DS(s) for this file. pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime and Change attributes for the file, so that the MDS doesn't need to acquire the attributes from the DS for every Getattr operation. For each regular (VREG) file, the MDS creates a data storage file on one (or more if mirroring is enabled) of the DSs in one of the "dsNN" subdirectories. The name of this file is the file handle of the file on the MDS in hexadecimal so that the name is unique. The DSs use subdirectories named "ds0" to "dsN" so that no one directory gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize on the MDS, with the default being 20. For production servers that will store a lot of files, this value should probably be much larger. It can be increased when the "nfsd" daemon is not running on the MDS, once the "dsK" directories are created. For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces of information to the client that allows it to do I/O directly to the DS. DeviceInfo - This is relatively static information that defines what a DS is. The critical bits of information returned by the FreeBSD server is the IP address of the DS and, for the Flexible File layout, that NFSv4.1 is to be used and that it is "tightly coupled". There is a "deviceid" which identifies the DeviceInfo. Layout - This is per file and can be recalled by the server when it is no longer valid. For the FreeBSD server, there is support for two types of layout, call File and Flexible File layout. Both allow the client to do I/O on the DS via NFSv4.1 I/O operations. The Flexible File layout is a more recent variant that allows specification of mirrors, where the client is expected to do writes to all mirrors to maintain them in a consistent state. The Flexible File layout also allows the client to report I/O errors for a DS back to the MDS. The Flexible File layout supports two variants referred to as "tightly coupled" vs "loosely coupled". The FreeBSD server always uses the "tightly coupled" variant where the client uses the same credentials to do I/O on the DS as it would on the MDS. For the "loosely coupled" variant, the layout specifies a synthetic user/group that the client uses to do I/O on the DS. The FreeBSD server does not do striping and always returns layouts for the entire file. The critical information in a layout is Read vs Read/Writea and DeviceID(s) that identify which DS(s) the data is stored on. At this time, the MDS generates File Layout layouts to NFSv4.1 clients that know how to do pNFS for the non-mirrored DS case unless the sysctl vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File layouts are generated. The mirrored DS configuration always generates Flexible File layouts. For NFS clients that do not support NFSv4.1 pNFS, all I/O operations are done against the MDS which acts as a proxy for the appropriate DS(s). When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy. If the DS is on the same machine, the MDS/DS will do the RPC on the DS as a proxy and so on, until the machine runs out of some resource, such as session slots or mbufs. As such, DSs must be separate systems from the MDS. Tested by: james.rose@framestore.com Relnotes: yes
|
#
d506aa14 |
|
09-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Delete an unused macro and clean up a comment about it. NFSDEV_MIRRORSTR was defined for the pNFS server, but has not been used, so this patch deletes it. It also cleans up the comment and hopefully makes it more readable.
|
#
9442a64e |
|
01-Jun-2018 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add the BindConnectiontoSession operation to the NFSv4.1 server. Under some fairly unusual circumstances, the Linux NFSv4.1 client is doing a BindConnectiontoSession operation for TCP connections. It is also used by the ESXi6.5 NFSv4.1 client. This patch adds this operation to the NFSv4.1 server. Reported by: andreas.nagy@frequentis.com Tested by: andreas.nagy@frequentis.com MFC after: 2 weeks
|
#
51369649 |
|
20-Nov-2017 |
Pedro F. Giffuni <pfg@FreeBSD.org> |
sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.
|
#
6b43e060 |
|
20-Sep-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add a few definitions for Flex File Layout for pNFS. These definitions will be used by a future commit.
|
#
95ac7f1a |
|
18-Jun-2017 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the NFS client/server so that it actually uses the 64bit ino_t filenos. The code still doesn't use d_off. That will come in a future commit. The code also removes the checks for servers returning a fileno that doesn't fit in 32bits, since that should work ok now. Bump __FreeBSD_version since this patch changes the interface between the NFS kernel modules. Reviewed by: kib
|
#
fbbd9655 |
|
28-Feb-2017 |
Warner Losh <imp@FreeBSD.org> |
Renumber copyright clause 4 Renumber cluase 4 to 3, per what everybody else did when BSD granted them permission to remove clause 3. My insistance on keeping the same numbering for legal reasons is too pedantic, so give up on that point. Submitted by: Jan Schaumann <jschauma@stevens.edu> Pull Request: https://github.com/freebsd/freebsd/pull/96
|
#
b2fc0141 |
|
23-Dec-2016 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix NFSv4.1 client recovery from NFS4ERR_BAD_SESSION errors. For most NFSv4.1 servers, a NFS4ERR_BAD_SESSION error is a rare failure that indicates that the server has lost session/open/lock state. However, recent testing by cperciva@ against the AmazonEFS server found several problems with client recovery from this due to it generating this failure frequently. Briefly, the problems fixed are: - If all session slots were in use at the time of the failure, some processes would continue to loop waiting for a slot on the old session forever. - If an RPC that doesn't use open/lock state failed with NFS4ERR_BAD_SESSION, it would fail the RPC/syscall instead of initiating recovery and then looping to retry the RPC. - If a successful reply to an RPC for an old session wasn't processed until after a new session was created for a NFS4ERR_BAD_SESSION error, it would erroneously update the new session and corrupt it. - The use of the first element of the session list in the nfs mount structure (which is always the current metadata session) was slightly racey. With changes for the above problems it became more racey, so all uses of this head pointer was wrapped with a NFSLOCKMNT()/NFSUNLOCKMNT(). - Although the kernel malloc() usually allocates more bytes than requested and, as such, this wouldn't have caused problems, the allocation of a session structure was 1 byte smaller than it should have been. (Null termination byte for the string not included in byte count.) There are probably still problems with a pNFS data server that fails with NFS4ERR_BAD_SESSION, but I have no server that does this to test against (the AmazonEFS server doesn't do pNFS), so I can't fix these yet. Although this patch is fairly large, it should only affect the handling of NFS4ERR_BAD_SESSION error replies from an NFSv4.1 server. Thanks go to cperciva@ for the extension testing he did to help isolate/fix these problems. Reported by: cperciva Tested by: cperciva MFC after: 3 months Differential Revision: https://reviews.freebsd.org/D8745
|
#
84be7e09 |
|
30-Nov-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add kernel support to the NFS server for the "-manage-gids" option that will be added to the nfsuserd daemon in a future commit. It modifies the cache used by NFSv4 for name<-->id translation (both username/uid and group/gid) to support this. When "-manage-gids" is set, the server looks up each uid for the RPC and uses the list of groups cached in the server instead of the list of groups provided in the RPC request. The cached group list is acquired for the cache by the nfsuserd daemon via getgrouplist(3). This avoids the 16 groups limit for the list in the RPC request. Since the cache is now used for every RPC when "-manage-gids" is enabled, the code also modifies the cache to use a separate mutex for each hash list instead of a single global mutex. Suggested by: jpaetzel Tested by: jpaetzel MFC after: 2 weeks
|
#
1f54e596 |
|
27-May-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
Make the size of the hash tables used by the NFSv4 server tunable. No appreciable change in performance was observed after increasing the sizes of these tables and then testing with a single client. However, there was an email that indicated high CPU overheads for a heavily loaded NFSv4 and it is hoped that increasing the sizes of the hash tables via these tunables might help. The tables remain the same size by default. Differential Revision: https://reviews.freebsd.org/D2596 MFC after: 2 weeks
|
#
66e80f77 |
|
16-Apr-2015 |
Rick Macklem <rmacklem@FreeBSD.org> |
mav@ has found that NFS servers exporting ZFS file systems can perform better when using a 128K read/write data size. This patch changes NFS_MAXDATA from 64K to 128K so that clients can use 128K for NFS mounts to allow this. The patch also renames NFS_MAXDATA to NFS_SRVMAXIO so that it is clear that it applies to the NFS server side only. It also avoids a name conflict with the NFS_MAXDATA defined in rpcsvc/nfs_prot.h, that is used for userland RPC. Tested by: mav Reviewed by: mav MFC after: 2 weeks
|
#
c59e4cc3 |
|
01-Jul-2014 |
Rick Macklem <rmacklem@FreeBSD.org> |
Merge the NFSv4.1 server code in projects/nfsv4.1-server over into head. The code is not believed to have any effect on the semantics of non-NFSv4.1 server behaviour. It is a rather large merge, but I am hoping that there will not be any regressions for the NFS server. MFC after: 1 month
|
#
fd77bbb9 |
|
26-Aug-2013 |
John Baldwin <jhb@FreeBSD.org> |
Remove most of the remaining sysctl name list macros. They were only ever intended for use in sysctl(8) and it has not used them for many years. Reviewed by: bde Tested by: exp-run by bdrewery
|
#
88a2437a |
|
08-Jul-2013 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add support for host-based (Kerberos 5 service principal) initiator credentials to the kernel rpc. Modify the NFSv4 client to add support for the gssname and allgssname mount options to use this capability. Requires the gssd daemon to be running with the "-h" option. Reviewed by: jhb
|
#
a89a2c8b |
|
25-Jan-2013 |
John Baldwin <jhb@FreeBSD.org> |
Further cleanups to use of timestamps in NFS: - Use NFSD_MONOSEC (which maps to time_uptime) instead of the seconds portion of wall-time stamps to manage timeouts on events. - Remove unused nd_starttime from the per-request structure in the new NFS server. - Use nanotime() for the modification time on a delegation to get as precise a time as possible. - Use time_second instead of extracting the second from a call to getmicrotime(). Submitted by: bde (3) Reviewed by: bde, rmacklem MFC after: 2 weeks
|
#
1f60bfd8 |
|
08-Dec-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detecting during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.
|
#
c52005a3 |
|
19-Sep-2012 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the NFSv4 client so that it can handle owner and owner_group strings that consist entirely of digits, interpreting them as the uid/gid number. This change was needed since new (>= 3.3) Linux servers reply with these strings by default. This change is mandated by the rfc3530bis draft. Reported on freebsd-stable@ under the Subject heading "Problem with Linux >= 3.3 as NFSv4 server" by Norbert Aschendorff on Aug. 20, 2012. Tested by: norbert.aschendorff at yahoo.de Reviewed by: jhb MFC after: 2 weeks
|
#
08aadbe3 |
|
06-May-2011 |
Alexander Motin <mav@FreeBSD.org> |
Increase NFS_TICKINTVL value from 10 to 500. Now that callout does useful things only once per second, so other 99 calls per second were useless and just don't allow idle system to sleep properly. Reviewed by: rmacklem
|
#
8e82d541 |
|
17-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change some defaults in the experimental NFS client to be the same as the regular NFS client for NFSv3. The main one is making use of a reserved port# the default. Also, set the retry limit for TCP the same and fix the code so that it doesn't disable readdirplus for NFSv4. MFC after: 2 weeks
|
#
7b8c319b |
|
15-Apr-2011 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the experimental NFS client so that it creates nfsiod threads in the same manner as the regular NFS client after r214026 was committed. This resolves the lors fixed by r214026 and its predecessors for the regular client. Reviewed by: jhb MFC after: 2 weeks
|
#
17891d00 |
|
25-Dec-2010 |
Rick Macklem <rmacklem@FreeBSD.org> |
Modify the experimental NFS server so that it uses LK_SHARED for RPC operations when it can. Since VFS_FHTOVP() currently always gets an exclusively locked vnode and is usually called at the beginning of each RPC, the RPCs for a given vnode will still be serialized. As such, passing a lock type argument to VFS_FHTOVP() would be preferable to doing the vn_lock() with LK_DOWNGRADE after the VFS_FHTOVP() call. Reviewed by: kib MFC after: 2 weeks
|
#
a7d5f7eb |
|
19-Oct-2010 |
Jamie Gritton <jamie@FreeBSD.org> |
A new jail(8) with a configuration file, to replace the work currently done by /etc/rc.d/jail.
|
#
65cc6600 |
|
20-Jun-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Replace RPCAUTH_UNIXGIDS with NFS_MAXGRPS so that nfscbd.c will build. Approved by: kib (mentor)
|
#
2c1e6cce |
|
19-Jun-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the size of the nfsc_groups[] array in the experimental nfs client to RPCAUTH_UNIXGIDS + 1 (17), since that is what can go on the wire for AUTH_SYS authentication. Reviewed by: brooks Approved by: kib (mentor)
|
#
79512355 |
|
22-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Fix the comment in sys/fs/nfs/nfs.h to correctly reflect the current use of the R_xxx flags. This changed when the NFS_LEGACYRPC code was removed from the subsystem. Approved by: kib (mentor)
|
#
0c794c5d |
|
15-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Move the nfsstat structure and proc/op number definitions on the experimental nfs subsystem from sys/fs/nfs/nfs.h and sys/fs/nfs/nfsproto.h to sys/fs/nfs/nfsport.h and rename nfsstat to ext_nfsstat. This was done so that src/usr.bin/nfsstat.c could use it alongside the regular nfs include files and struct nfsstat. Approved by: kib (mentor)
|
#
98ad4453 |
|
14-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Apply changes to the experimental nfs server so that it uses the security flavors as exported in FreeBSD-CURRENT. This allows it to use a slightly modified mountd.c instead of a different utility. Approved by: kib (mentor)
|
#
1c6c0ed9 |
|
11-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Change the name of the nfs server addsock structure from nfsd_args to nfsd_addsock_args, so that it is consistent with the one in sys/nfsserver/nfs.h. Approved by: kib (mentor)
|
#
9ec7b004 |
|
04-May-2009 |
Rick Macklem <rmacklem@FreeBSD.org> |
Add the experimental nfs subtree to the kernel, that includes support for NFSv4 as well as NFSv2 and 3. It lives in 3 subdirs under sys/fs: nfs - functions that are common to the client and server nfsclient - a mutation of sys/nfsclient that call generic functions to do RPCs and handle state. As such, it retains the buffer cache handling characteristics and vnode semantics that are found in sys/nfsclient, for the most part. nfsserver - the server. It includes a DRC designed specifically for NFSv4, that is used instead of the generic DRC in sys/rpc. The build glue will be checked in later, so at this point, it consists of 3 new subdirs that should not affect kernel building. Approved by: kib (mentor)
|