.de BT
.if \\n%=1 .tl ''- % -''
..
.ND
.\" prevent excess underlining in nroff
.if n .fp 2 R
.OH 'Network File System: Version 2 Protocol Specification''Page %'
.EH 'Page %''Network File System: Version 2 Protocol Specification'
.if \n%=1 .bp
.SH
\&Network File System: Version 2 Protocol Specification
.IX NFS "" "" "" PAGE MAJOR
.IX "Network File System" "" "" "" PAGE MAJOR
.IX NFS "version-2 protocol specification"
.IX "Network File System" "version-2 protocol specification"
.LP
.NH 0
\&Status of this Standard
.LP
Note: This document specifies a protocol that Sun Microsystems, Inc.,
and others are using. It specifies it in standard ARPA RFC form.
.NH 1
\&Introduction
.IX NFS introduction
.LP
The Sun Network Filesystem (NFS) protocol provides transparent remote
access to shared filesystems over local area networks. The NFS
protocol is designed to be machine, operating system, network architecture,
and transport protocol independent. This independence is
achieved through the use of Remote Procedure Call (RPC) primitives
built on top of an External Data Representation (XDR). Implementations
exist for a variety of machines, from personal computers to
supercomputers.
.LP
The supporting mount protocol allows the server to hand out remote
access privileges to a restricted set of clients. It performs the
operating system-specific functions that allow, for example,
remote directory trees to be attached to a local file system.
.NH 2
\&Remote Procedure Call
.IX "Remote Procedure Call"
.LP
Sun's remote procedure call specification provides a
procedure-oriented interface to remote services. Each server supplies a
program that is a set of procedures. NFS is one such "program".
The combination of host address, program number, and procedure
number specifies one remote service procedure.
RPC does not depend
on services provided by specific protocols, so it can be used with
any underlying transport protocol. See the
.I "Remote Procedure Calls: Protocol Specification"
chapter of this manual.
.NH 2
\&External Data Representation
.IX "External Data Representation"
.LP
The External Data Representation (XDR) standard provides a common
way of representing a set of data types over a network.
The NFS
Protocol Specification is written using the RPC data description
language.
For more information, see the
.I "External Data Representation Standard: Protocol Specification" .
Sun provides implementations of XDR and
RPC, but NFS does not require their use. Any software that
provides equivalent functionality can be used, and if the encoding
is exactly the same it can interoperate with other implementations
of NFS.
.NH 2
\&Stateless Servers
.IX "stateless servers"
.IX servers stateless
.LP
The NFS protocol is stateless. That is, a server does not need to
maintain any extra state information about any of its clients in
order to function correctly. Stateless servers have a distinct
advantage over stateful servers in the event of a failure. With
stateless servers, a client need only retry a request until the
server responds; it does not even need to know that the server has
crashed, or the network temporarily went down. The client of a
stateful server, on the other hand, needs to either detect a server
crash and rebuild the server's state when it comes back up, or
cause client operations to fail.
.LP
This may not sound like an important issue, but it affects the
protocol in some unexpected ways. We feel that it is worth a bit
of extra complexity in the protocol to be able to write very simple
servers that do not require fancy crash recovery.
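.LP
The retry behavior described above can be illustrated with a minimal
sketch. It assumes a hypothetical callable, send_request, that either
returns a reply or raises TimeoutError when no reply arrives; neither
the name nor the back-off policy comes from the protocol itself.

```python
import time


def call_with_retry(send_request, max_attempts=5, initial_delay=0.01):
    """Retry a request until the server answers or attempts run out.

    send_request is a hypothetical callable (an assumption of this
    sketch): it returns the reply, or raises TimeoutError when no
    reply arrives in time. The client keeps no record of why the
    server was silent; it simply sends the same request again.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except TimeoutError:
            if attempt == max_attempts:
                raise
            time.sleep(delay)
            delay *= 2  # back off; the doubling policy is illustrative only
```

Because the server holds no per-client state, the retransmitted
request is handled like any other request, whether the silence was
caused by a server crash or by a lost datagram.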
.LP
On the other hand, NFS deals with objects such as files and
directories that inherently have state -- what good would a file be
if it did not keep its contents intact? The goal is to not
introduce any extra state in the protocol itself. Another way to
simplify recovery is by making operations "idempotent" whenever
possible (so that they can potentially be repeated).
.NH 1
\&NFS Protocol Definition
.IX NFS "protocol definition"
.IX NFS protocol
.LP
Servers have been known to change over time, and so can the
protocol that they use. So RPC provides a version number with each
RPC request. This RFC describes version two of the NFS protocol.
Even in the second version, there are various obsolete procedures
and parameters, which will be removed in later versions. An RFC
for version three of the NFS protocol is currently under
preparation.
.NH 2
\&File System Model
.IX filesystem model
.LP
NFS assumes a file system that is hierarchical, with directories as
all but the bottom-level files. Each entry in a directory (file,
directory, device, etc.) has a string name. Different operating
systems may have restrictions on the depth of the tree or the names
used, as well as using different syntax to represent the "pathname",
which is the concatenation of all the "components" (directory and
file names) in the name. A "file system" is a tree on a single
server (usually a single disk or physical partition) with a specified
"root". Some operating systems provide a "mount" operation to make
all file systems appear as a single tree, while others maintain a
"forest" of file systems. Files are unstructured streams of
uninterpreted bytes. Version 3 of NFS uses a slightly more general
file system model.
.LP
NFS looks up one component of a pathname at a time.
It may not be
obvious why it does not just take the whole pathname, traipse down
the directories, and return a file handle when it is done. There are
several good reasons not to do this. First, pathnames need
separators between the directory components, and different operating
systems use different separators. We could define a Network Standard
Pathname Representation, but then every pathname would have to be
parsed and converted at each end. Other issues are discussed in
\fINFS Implementation Issues\fP below.
.LP
Although files and directories are similar objects in many ways,
different procedures are used to read directories and files. This
provides a network standard format for representing directories. The
same argument as above could have been used to justify a procedure
that returns only one directory entry per call. The problem is
efficiency. Directories can contain many entries, and a remote call
to return each would be just too slow.
.NH 2
\&RPC Information
.IX NFS "RPC information"
.IP \fIAuthentication\fP
The NFS service uses
.I AUTH_UNIX ,
.I AUTH_DES ,
or
.I AUTH_SHORT
style
authentication, except in the NULL procedure where
.I AUTH_NONE
is also allowed.
.IP "\fITransport Protocols\fP"
NFS is currently supported on UDP/IP only.
.IP "\fIPort Number\fP"
The NFS protocol currently uses the UDP port number 2049. This is
not an officially assigned port, so later versions of the protocol
use the \*QPortmapping\*U facility of RPC.
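.LP
As an illustration of the RPC parameters listed above, the following
sketch builds the body of a call to the NULL procedure using
AUTH_NONE. The call-header layout (xid, message type, RPC version,
program, version, procedure, credential, verifier) and the numeric
values of RPC_CALL, RPC_VERSION and AUTH_NONE are taken from the RPC
protocol specification, not from this chapter; the function name is an
assumption of the sketch, and nothing is sent on the network.

```python
import struct

NFS_PROGRAM = 100003   # program number from the NFS protocol definition
NFS_VERSION = 2        # this document describes version two
NFSPROC_NULL = 0
NFS_PORT = 2049        # UDP port named in this section (not used below)

# Values below come from the RPC specification, not from this chapter.
RPC_CALL = 0           # message type: call
RPC_VERSION = 2        # version of the RPC protocol itself
AUTH_NONE = 0          # authentication flavor with an empty body


def build_null_call(xid):
    """Return the XDR-encoded RPC call message for NFSPROC_NULL."""
    header = struct.pack(
        ">6I", xid, RPC_CALL, RPC_VERSION,
        NFS_PROGRAM, NFS_VERSION, NFSPROC_NULL,
    )
    credential = struct.pack(">II", AUTH_NONE, 0)  # flavor, body length 0
    verifier = struct.pack(">II", AUTH_NONE, 0)
    return header + credential + verifier
```

A client would send these bytes in a single UDP datagram to port 2049
on the server; the NULL procedure takes no arguments, so nothing
follows the verifier.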
.NH 2
\&Sizes of XDR Structures
.IX "XDR structure sizes"
.LP
These are the sizes, given in decimal bytes, of various XDR
structures used in the protocol:
.DS
/* \fIThe maximum number of bytes of data in a READ or WRITE request\fP */
const MAXDATA = 8192;

/* \fIThe maximum number of bytes in a pathname argument\fP */
const MAXPATHLEN = 1024;

/* \fIThe maximum number of bytes in a file name argument\fP */
const MAXNAMLEN = 255;

/* \fIThe size in bytes of the opaque "cookie" passed by READDIR\fP */
const COOKIESIZE = 4;

/* \fIThe size in bytes of the opaque file handle\fP */
const FHSIZE = 32;
.DE
.NH 2
\&Basic Data Types
.IX "NFS data types"
.IX NFS "basic data types"
.LP
The following XDR definitions are basic structures and types used
in other structures described further on.
.KS
.NH 3
\&stat
.IX "NFS data types" stat "" \fIstat\fP
.DS
enum stat {
    NFS_OK = 0,
    NFSERR_PERM = 1,
    NFSERR_NOENT = 2,
    NFSERR_IO = 5,
    NFSERR_NXIO = 6,
    NFSERR_ACCES = 13,
    NFSERR_EXIST = 17,
    NFSERR_NODEV = 19,
    NFSERR_NOTDIR = 20,
    NFSERR_ISDIR = 21,
    NFSERR_FBIG = 27,
    NFSERR_NOSPC = 28,
    NFSERR_ROFS = 30,
    NFSERR_NAMETOOLONG = 63,
    NFSERR_NOTEMPTY = 66,
    NFSERR_DQUOT = 69,
    NFSERR_STALE = 70,
    NFSERR_WFLUSH = 99
};
.DE
.KE
.LP
The
.I stat
type is returned with every procedure's results. A
value of
.I NFS_OK
indicates that the call completed successfully and
the results are valid. The other values indicate some kind of
error occurred on the server side during the servicing of the
procedure. The error values are derived from UNIX error numbers.
.IP \fBNFSERR_PERM\fP:
Not owner. The caller does not have correct ownership
to perform the requested operation.
.IP \fBNFSERR_NOENT\fP:
No such file or directory. The file or directory
specified does not exist.
.IP \fBNFSERR_IO\fP:
Some sort of hard error occurred when the operation was
in progress. This could be a disk error, for example.
.IP \fBNFSERR_NXIO\fP:
No such device or address.
.IP \fBNFSERR_ACCES\fP:
Permission denied. The caller does not have the
correct permission to perform the requested operation.
.IP \fBNFSERR_EXIST\fP:
File exists. The file specified already exists.
.IP \fBNFSERR_NODEV\fP:
No such device.
.IP \fBNFSERR_NOTDIR\fP:
Not a directory. The caller specified a
non-directory in a directory operation.
.IP \fBNFSERR_ISDIR\fP:
Is a directory. The caller specified a directory in
a non-directory operation.
.IP \fBNFSERR_FBIG\fP:
File too large. The operation caused a file to grow
beyond the server's limit.
.IP \fBNFSERR_NOSPC\fP:
No space left on device. The operation caused the
server's filesystem to reach its limit.
.IP \fBNFSERR_ROFS\fP:
Read-only filesystem. Write attempted on a read-only filesystem.
.IP \fBNFSERR_NAMETOOLONG\fP:
File name too long. The file name in an operation was too long.
.IP \fBNFSERR_NOTEMPTY\fP:
Directory not empty. Attempted to remove a
directory that was not empty.
.IP \fBNFSERR_DQUOT\fP:
Disk quota exceeded. The client's disk quota on the
server has been exceeded.
.IP \fBNFSERR_STALE\fP:
The "fhandle" given in the arguments was invalid.
That is, the file referred to by that file handle no longer exists,
or access to it has been revoked.
.IP \fBNFSERR_WFLUSH\fP:
The server's write cache used in the
.I WRITECACHE
call was flushed to disk.
.LP
.KS
.NH 3
\&ftype
.IX "NFS data types" ftype "" \fIftype\fP
.DS
enum ftype {
    NFNON = 0,
    NFREG = 1,
    NFDIR = 2,
    NFBLK = 3,
    NFCHR = 4,
    NFLNK = 5
};
.DE
.KE
The enumeration
.I ftype
gives the type of a file.
The type
.I NFNON
indicates a non-file,
.I NFREG
is a regular file,
.I NFDIR
is a directory,
.I NFBLK
is a block-special device,
.I NFCHR
is a character-special device, and
.I NFLNK
is a symbolic link.
.KS
.NH 3
\&fhandle
.IX "NFS data types" fhandle "" \fIfhandle\fP
.DS
typedef opaque fhandle[FHSIZE];
.DE
.KE
The
.I fhandle
is the file handle passed between the server and the client.
All file operations are done using file handles to refer to a file or
directory. The file handle can contain whatever information the server
needs to distinguish an individual file.
.KS
.NH 3
\&timeval
.IX "NFS data types" timeval "" \fItimeval\fP
.DS
struct timeval {
    unsigned int seconds;
    unsigned int useconds;
};
.DE
.KE
The
.I timeval
structure is the number of seconds and microseconds
since midnight January 1, 1970, Greenwich Mean Time. It is used to
pass time and date information.
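.LP
A minimal sketch of how a timeval might be put on the wire, assuming
the standard XDR encoding of an unsigned int as four bytes in
big-endian order. The function names are illustrative and are not
part of the protocol.

```python
import struct


def encode_timeval(seconds, useconds):
    """Encode a timeval as two XDR unsigned ints (big-endian, 4 bytes each)."""
    return struct.pack(">II", seconds, useconds)


def decode_timeval(data):
    """Decode the first 8 bytes of data into fractional seconds since 1970."""
    seconds, useconds = struct.unpack(">II", data[:8])
    return seconds + useconds / 1_000_000
```

For example, one and a half seconds after the epoch is encoded as the
eight bytes 00 00 00 01 00 07 a1 20.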
.KS
.NH 3
\&fattr
.IX "NFS data types" fattr "" \fIfattr\fP
.DS
struct fattr {
    ftype        type;
    unsigned int mode;
    unsigned int nlink;
    unsigned int uid;
    unsigned int gid;
    unsigned int size;
    unsigned int blocksize;
    unsigned int rdev;
    unsigned int blocks;
    unsigned int fsid;
    unsigned int fileid;
    timeval      atime;
    timeval      mtime;
    timeval      ctime;
};
.DE
.KE
The
.I fattr
structure contains the attributes of a file; "type" is the type of
the file; "nlink" is the number of hard links to the file (the number
of different names for the same file); "uid" is the user
identification number of the owner of the file; "gid" is the group
identification number of the group of the file; "size" is the size in
bytes of the file; "blocksize" is the size in bytes of a block of the
file; "rdev" is the device number of the file if it is type
.I NFCHR
or
.I NFBLK ;
"blocks" is the number of blocks the file takes up on disk; "fsid" is
the file system identifier for the filesystem containing the file;
"fileid" is a number that uniquely identifies the file within its
filesystem; "atime" is the time when the file was last accessed for
either read or write; "mtime" is the time when the file data was last
modified (written); and "ctime" is the time when the status of the
file was last changed. Writing to the file also changes "ctime" if
the size of the file changes.
.LP
"mode" is the access mode encoded as a set of bits. Notice that the
file type is specified both in the mode bits and in the file type.
This is really a bug in the protocol and will be fixed in future
versions. The descriptions given below specify the bit positions
using octal numbers.
.TS
box tab (&) ;
cfI cfI
lfL l .
Bit&Description
_
0040000&This is a directory; "type" field should be NFDIR.
0020000&This is a character special file; "type" field should be NFCHR.
0060000&This is a block special file; "type" field should be NFBLK.
0100000&This is a regular file; "type" field should be NFREG.
0120000&This is a symbolic link file; "type" field should be NFLNK.
0140000&This is a named socket; "type" field should be NFNON.
0004000&Set user id on execution.
0002000&Set group id on execution.
0001000&Save swapped text even after use.
0000400&Read permission for owner.
0000200&Write permission for owner.
0000100&Execute and search permission for owner.
0000040&Read permission for group.
0000020&Write permission for group.
0000010&Execute and search permission for group.
0000004&Read permission for others.
0000002&Write permission for others.
0000001&Execute and search permission for others.
.TE
.KS
Notes:
.IP
The bits are the same as the mode bits returned by the
.I stat(2)
system call in the UNIX system. The file type is specified both in
the mode bits and in the file type. This is fixed in future
versions.
.IP
The "rdev" field in the attributes structure is an operating system
specific device specifier. It will be removed and generalized in
the next revision of the protocol.
.KE
.LP
.KS
.NH 3
\&sattr
.IX "NFS data types" sattr "" \fIsattr\fP
.DS
struct sattr {
    unsigned int mode;
    unsigned int uid;
    unsigned int gid;
    unsigned int size;
    timeval      atime;
    timeval      mtime;
};
.DE
.KE
The
.I sattr
structure contains the file attributes which can be set
from the client. The fields are the same as for
.I fattr
above. A "size" of zero means the file should be truncated.
A value of -1 indicates a field that should be ignored.
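.LP
The "-1 means ignore" convention can be sketched as follows. The
sketch assumes that -1 in an unsigned int field is the all-ones value
0xFFFFFFFF, and that an ignored timeval has both of its members set to
that value; the second point is an assumption of this example rather
than something stated above. The function name and the use of None
are likewise illustrative.

```python
import struct

UNCHANGED = 0xFFFFFFFF  # -1 viewed as an unsigned 32-bit integer


def encode_sattr(mode=None, uid=None, gid=None, size=None,
                 atime=None, mtime=None):
    """XDR-encode an sattr; None marks a field the server should ignore.

    atime and mtime, when given, are (seconds, useconds) pairs.
    """
    def u32(value):
        return UNCHANGED if value is None else value

    def tv(pair):
        # Assumption: an ignored timeval carries -1 in both members.
        return (UNCHANGED, UNCHANGED) if pair is None else pair

    return struct.pack(
        ">8I", u32(mode), u32(uid), u32(gid), u32(size),
        *tv(atime), *tv(mtime),
    )
```

Calling encode_sattr(size=0) produces a request that truncates the
file and leaves every other attribute alone.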
.LP
.KS
.NH 3
\&filename
.IX "NFS data types" filename "" \fIfilename\fP
.DS
typedef string filename<MAXNAMLEN>;
.DE
.KE
The type
.I filename
is used for passing file names or pathname components.
.LP
.KS
.NH 3
\&path
.IX "NFS data types" path "" \fIpath\fP
.DS
typedef string path<MAXPATHLEN>;
.DE
.KE
The type
.I path
is a pathname. The server considers it as a string
with no internal structure, but to the client it is the name of a
node in a filesystem tree.
.LP
.KS
.NH 3
\&attrstat
.IX "NFS data types" attrstat "" \fIattrstat\fP
.DS
union attrstat switch (stat status) {
    case NFS_OK:
        fattr attributes;
    default:
        void;
};
.DE
.KE
The
.I attrstat
structure is a common procedure result. It contains
a "status" and, if the call succeeded, it also contains the
attributes of the file on which the operation was done.
.LP
.KS
.NH 3
\&diropargs
.IX "NFS data types" diropargs "" \fIdiropargs\fP
.DS
struct diropargs {
    fhandle  dir;
    filename name;
};
.DE
.KE
The
.I diropargs
structure is used in directory operations. The
file handle "dir" is the directory in which to find the file "name".
A directory operation is one in which the directory is affected.
.LP
.KS
.NH 3
\&diropres
.IX "NFS data types" diropres "" \fIdiropres\fP
.DS
union diropres switch (stat status) {
    case NFS_OK:
        struct {
            fhandle file;
            fattr   attributes;
        } diropok;
    default:
        void;
};
.DE
.KE
The results of a directory operation are returned in a
.I diropres
structure. If the call succeeded, a new file handle "file" and the
"attributes" associated with that file are returned along with the
"status".
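.LP
To show how these definitions map onto bytes, the following sketch
encodes a diropargs value. It relies on the XDR rules that a
fixed-length opaque is sent as-is and that a string is sent as a
four-byte length followed by its bytes, padded with zeros to a
multiple of four; those rules come from the XDR standard, and the
function name is an assumption of the sketch.

```python
import struct

FHSIZE = 32      # size of the opaque file handle
MAXNAMLEN = 255  # maximum bytes in a file name argument


def encode_diropargs(dir_handle, name):
    """XDR-encode diropargs from a 32-byte handle and a name given as bytes."""
    if len(dir_handle) != FHSIZE:
        raise ValueError("file handle must be exactly FHSIZE bytes")
    if len(name) > MAXNAMLEN:
        raise ValueError("name exceeds MAXNAMLEN")
    padding = (4 - len(name) % 4) % 4  # XDR pads strings to a 4-byte boundary
    return (dir_handle
            + struct.pack(">I", len(name))
            + name
            + b"\x00" * padding)
```

The same bytes serve as the argument of any procedure that takes
diropargs, such as LOOKUP or REMOVE.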
.NH 2
\&Server Procedures
.IX "NFS server procedures" "" "" "" PAGE MAJOR
.LP
The protocol definition is given as a set of procedures with
arguments and results defined using the RPC language. A brief
description of the function of each procedure should provide enough
information to allow implementation.
.LP
All of the procedures in the NFS protocol are assumed to be
synchronous. When a procedure returns to the client, the client
can assume that the operation has completed and any data associated
with the request is now on stable storage. For example, a client
.I WRITE
request may cause the server to update data blocks,
filesystem information blocks (such as indirect blocks), and file
attribute information (size and modify times). When the
.I WRITE
returns to the client, it can assume that the write is safe, even
in case of a server crash, and it can discard the data written.
This is a very important part of the statelessness of the server.
If the server waited to flush data from remote requests, the client
would have to save those requests so that it could resend them in
case of a server crash.
.ie t .DS
.el .DS L

.ft I
/*
* Remote file service routines
*/
.ft CW
program NFS_PROGRAM {
    version NFS_VERSION {
        void        NFSPROC_NULL(void)              = 0;
        attrstat    NFSPROC_GETATTR(fhandle)        = 1;
        attrstat    NFSPROC_SETATTR(sattrargs)      = 2;
        void        NFSPROC_ROOT(void)              = 3;
        diropres    NFSPROC_LOOKUP(diropargs)       = 4;
        readlinkres NFSPROC_READLINK(fhandle)       = 5;
        readres     NFSPROC_READ(readargs)          = 6;
        void        NFSPROC_WRITECACHE(void)        = 7;
        attrstat    NFSPROC_WRITE(writeargs)        = 8;
        diropres    NFSPROC_CREATE(createargs)      = 9;
        stat        NFSPROC_REMOVE(diropargs)       = 10;
        stat        NFSPROC_RENAME(renameargs)      = 11;
        stat        NFSPROC_LINK(linkargs)          = 12;
        stat        NFSPROC_SYMLINK(symlinkargs)    = 13;
        diropres    NFSPROC_MKDIR(createargs)       = 14;
        stat        NFSPROC_RMDIR(diropargs)        = 15;
        readdirres  NFSPROC_READDIR(readdirargs)    = 16;
        statfsres   NFSPROC_STATFS(fhandle)         = 17;
    } = 2;
} = 100003;
.DE
.KS
.NH 3
\&Do Nothing
.IX "NFS server procedures" NFSPROC_NULL() "" \fINFSPROC_NULL()\fP
.DS
void
NFSPROC_NULL(void) = 0;
.DE
.KE
This procedure does no work. It is made available in all RPC
services to allow server response testing and timing.
.KS
.NH 3
\&Get File Attributes
.IX "NFS server procedures" NFSPROC_GETATTR() "" \fINFSPROC_GETATTR()\fP
.DS
attrstat
NFSPROC_GETATTR(fhandle) = 1;
.DE
.KE
If the reply status is
.I NFS_OK ,
then the reply attributes contain
the attributes for the file given by the input fhandle.
.KS
.NH 3
\&Set File Attributes
.IX "NFS server procedures" NFSPROC_SETATTR() "" \fINFSPROC_SETATTR()\fP
.DS
struct sattrargs {
    fhandle file;
    sattr   attributes;
};

attrstat
NFSPROC_SETATTR(sattrargs) = 2;
.DE
.KE
The "attributes" argument contains fields which are either -1 or
are the new value for the attributes of "file".
If the reply
status is
.I NFS_OK ,
then the reply attributes have the attributes of
the file after the "SETATTR" operation has completed.
.LP
Note: The use of -1 to indicate an unused field in "attributes" is
changed in the next version of the protocol.
.KS
.NH 3
\&Get Filesystem Root
.IX "NFS server procedures" NFSPROC_ROOT "" \fINFSPROC_ROOT\fP
.DS
void
NFSPROC_ROOT(void) = 3;
.DE
.KE
Obsolete. This procedure is no longer used because finding the
root file handle of a filesystem requires moving pathnames between
client and server. To do this right we would have to define a
network standard representation of pathnames. Instead, the
function of looking up the root file handle is done by the
.I MNTPROC_MNT()
procedure. (See the
.I "Mount Protocol Definition"
later in this chapter for details).
.KS
.NH 3
\&Look Up File Name
.IX "NFS server procedures" NFSPROC_LOOKUP() "" \fINFSPROC_LOOKUP()\fP
.DS
diropres
NFSPROC_LOOKUP(diropargs) = 4;
.DE
.KE
If the reply "status" is
.I NFS_OK ,
then the reply "file" and reply
"attributes" are the file handle and attributes for the file "name"
in the directory given by "dir" in the argument.
.KS
.NH 3
\&Read From Symbolic Link
.IX "NFS server procedures" NFSPROC_READLINK() "" \fINFSPROC_READLINK()\fP
.DS
union readlinkres switch (stat status) {
    case NFS_OK:
        path data;
    default:
        void;
};

readlinkres
NFSPROC_READLINK(fhandle) = 5;
.DE
.KE
If "status" has the value
.I NFS_OK ,
then the reply "data" is the data in
the symbolic link given by the file referred to by the fhandle argument.
.LP
Note: since NFS always parses pathnames on the client, the
pathname in a symbolic link may mean something different (or be
meaningless) on a different client or on the server if a different
pathname syntax is used.
.KS
.NH 3
\&Read From File
.IX "NFS server procedures" NFSPROC_READ "" \fINFSPROC_READ\fP
.DS
struct readargs {
    fhandle  file;
    unsigned offset;
    unsigned count;
    unsigned totalcount;
};

union readres switch (stat status) {
    case NFS_OK:
        fattr  attributes;
        opaque data<MAXDATA>;
    default:
        void;
};

readres
NFSPROC_READ(readargs) = 6;
.DE
.KE
Returns up to "count" bytes of "data" from the file given by
"file", starting at "offset" bytes from the beginning of the file.
The first byte of the file is at offset zero. The file attributes
after the read takes place are returned in "attributes".
.LP
Note: The argument "totalcount" is unused, and is removed in the
next protocol revision.
.KS
.NH 3
\&Write to Cache
.IX "NFS server procedures" NFSPROC_WRITECACHE() "" \fINFSPROC_WRITECACHE()\fP
.DS
void
NFSPROC_WRITECACHE(void) = 7;
.DE
.KE
To be used in the next protocol revision.
.KS
.NH 3
\&Write to File
.IX "NFS server procedures" NFSPROC_WRITE() "" \fINFSPROC_WRITE()\fP
.DS
struct writeargs {
    fhandle  file;
    unsigned beginoffset;
    unsigned offset;
    unsigned totalcount;
    opaque   data<MAXDATA>;
};

attrstat
NFSPROC_WRITE(writeargs) = 8;
.DE
.KE
Writes "data" beginning "offset" bytes from the beginning of
"file". The first byte of the file is at offset zero. If the
reply "status" is NFS_OK, then the reply "attributes" contains the
attributes of the file after the write has completed. The write
operation is atomic. Data from this call to
.I WRITE
will not be mixed with data from another client's calls.
.LP
Note: The arguments "beginoffset" and "totalcount" are ignored and
are removed in the next protocol revision.
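.LP
Because a single READ or WRITE carries at most MAXDATA bytes and
names its own offset, a client has to split a larger transfer into
several calls. A minimal sketch of that splitting follows; the
function name is hypothetical, and a real client would typically
choose its chunk size from the server's preferred transfer size
rather than always using MAXDATA.

```python
MAXDATA = 8192  # maximum bytes of data in one READ or WRITE request


def plan_transfers(offset, length, chunk=MAXDATA):
    """Split a byte range into (offset, count) pairs of at most chunk bytes."""
    if chunk <= 0 or chunk > MAXDATA:
        raise ValueError("chunk must be between 1 and MAXDATA")
    requests = []
    end = offset + length
    while offset < end:
        count = min(chunk, end - offset)
        requests.append((offset, count))
        offset += count
    return requests
```

Each pair is self-describing, so any one of the resulting requests can
be retransmitted on its own without the server remembering the others.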
.KS
.NH 3
\&Create File
.IX "NFS server procedures" NFSPROC_CREATE() "" \fINFSPROC_CREATE()\fP
.DS
struct createargs {
    diropargs where;
    sattr     attributes;
};

diropres
NFSPROC_CREATE(createargs) = 9;
.DE
.KE
The file "where.name" is created in the directory given by "where.dir". The
initial attributes of the new file are given by "attributes". A
reply "status" of NFS_OK indicates that the file was created, and
reply "file" and reply "attributes" are its file handle and
attributes. Any other reply "status" means that the operation
failed and no file was created.
.LP
Note: This routine should pass an exclusive create flag, meaning
"create the file only if it is not already there".
.KS
.NH 3
\&Remove File
.IX "NFS server procedures" NFSPROC_REMOVE() "" \fINFSPROC_REMOVE()\fP
.DS
stat
NFSPROC_REMOVE(diropargs) = 10;
.DE
.KE
The file "name" is removed from the directory given by "dir". A
reply of NFS_OK means the directory entry was removed.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Rename File
.IX "NFS server procedures" NFSPROC_RENAME() "" \fINFSPROC_RENAME()\fP
.DS
struct renameargs {
    diropargs from;
    diropargs to;
};

stat
NFSPROC_RENAME(renameargs) = 11;
.DE
.KE
The existing file "from.name" in the directory given by "from.dir"
is renamed to "to.name" in the directory given by "to.dir". If the
reply is
.I NFS_OK ,
the file was renamed. The
.I RENAME
operation is
atomic on the server; it cannot be interrupted in the middle.
.LP
Note: possibly non-idempotent operation.
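.LP
The notes marking REMOVE and RENAME as possibly non-idempotent can be
made concrete with a toy model. The class below is purely
illustrative (it is not how any server is implemented); it only shows
what a client sees when the reply to a successful REMOVE is lost and
the same request is retransmitted.

```python
NFS_OK = 0
NFSERR_NOENT = 2


class ToyDirectory:
    """In-memory stand-in for one directory on a stateless server."""

    def __init__(self, names):
        self.names = set(names)

    def remove(self, name):
        """Remove a name and return the stat value a server would send."""
        if name not in self.names:
            return NFSERR_NOENT
        self.names.remove(name)
        return NFS_OK


directory = ToyDirectory({"report.txt"})
first_reply = directory.remove("report.txt")   # executed; reply lost in transit
second_reply = directory.remove("report.txt")  # the client's retransmission
```

Here first_reply is NFS_OK and second_reply is NFSERR_NOENT, so a
client that only sees the second reply is told the operation failed
even though the file was removed by its own first request.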
.KS
.NH 3
\&Create Link to File
.IX "NFS server procedures" NFSPROC_LINK() "" \fINFSPROC_LINK()\fP
.DS
struct linkargs {
    fhandle   from;
    diropargs to;
};

stat
NFSPROC_LINK(linkargs) = 12;
.DE
.KE
Creates the file "to.name" in the directory given by "to.dir",
which is a hard link to the existing file given by "from". If the
return value is
.I NFS_OK ,
a link was created. Any other return value
indicates an error, and the link was not created.
.LP
A hard link should have the property that changes to either of the
linked files are reflected in both files. When a hard link is made
to a file, the attributes for the file should have a value for
"nlink" that is one greater than the value before the link.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Create Symbolic Link
.IX "NFS server procedures" NFSPROC_SYMLINK() "" \fINFSPROC_SYMLINK()\fP
.DS
struct symlinkargs {
    diropargs from;
    path      to;
    sattr     attributes;
};

stat
NFSPROC_SYMLINK(symlinkargs) = 13;
.DE
.KE
Creates the file "from.name" with ftype
.I NFLNK
in the directory
given by "from.dir". The new file contains the pathname "to" and
has initial attributes given by "attributes". If the return value
is
.I NFS_OK ,
a link was created. Any other return value indicates an
error, and the link was not created.
.LP
A symbolic link is a pointer to another file. The name given in
"to" is not interpreted by the server, only stored in the newly
created file. When the client references a file that is a symbolic
link, the contents of the symbolic link are normally transparently
reinterpreted as a pathname to substitute. A
.I READLINK
operation returns the data to the client for interpretation.
.LP
Note: On UNIX servers the attributes are never used, since
symbolic links always have mode 0777.
.KS
.NH 3
\&Create Directory
.IX "NFS server procedures" NFSPROC_MKDIR() "" \fINFSPROC_MKDIR()\fP
.DS
diropres
NFSPROC_MKDIR(createargs) = 14;
.DE
.KE
The new directory "where.name" is created in the directory given by
"where.dir". The initial attributes of the new directory are given
by "attributes". A reply "status" of NFS_OK indicates that the new
directory was created, and reply "file" and reply "attributes" are
its file handle and attributes. Any other reply "status" means
that the operation failed and no directory was created.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Remove Directory
.IX "NFS server procedures" NFSPROC_RMDIR() "" \fINFSPROC_RMDIR()\fP
.DS
stat
NFSPROC_RMDIR(diropargs) = 15;
.DE
.KE
The existing empty directory "name" in the directory given by "dir"
is removed. If the reply is
.I NFS_OK ,
the directory was removed.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Read From Directory
.IX "NFS server procedures" NFSPROC_READDIR() "" \fINFSPROC_READDIR()\fP
.DS
struct readdirargs {
    fhandle   dir;
    nfscookie cookie;
    unsigned  count;
};

struct entry {
    unsigned  fileid;
    filename  name;
    nfscookie cookie;
    entry    *nextentry;
};

union readdirres switch (stat status) {
    case NFS_OK:
        struct {
            entry *entries;
            bool   eof;
        } readdirok;
    default:
        void;
};

readdirres
NFSPROC_READDIR(readdirargs) = 16;
.DE
.KE
Returns a variable number of directory entries, with a total size
of up to "count" bytes, from the directory given by "dir". If the
returned value of "status" is
.I NFS_OK ,
then it is followed by a
variable number of "entry"s.
Each "entry" contains a "fileid"
which consists of a unique number to identify the file within a
filesystem, the "name" of the file, and a "cookie" which is an
opaque pointer to the next entry in the directory. The cookie is
used in the next
.I READDIR
call to get more entries starting at a
given point in the directory. The special cookie zero (all bits
zero) can be used to get the entries starting at the beginning of
the directory. The "fileid" field should be the same number as the
"fileid" in the attributes of the file. (See the
.I "Basic Data Types"
section.)
The "eof" flag has a value of
.I TRUE
if there are no more entries in the directory.
.KS
.NH 3
\&Get Filesystem Attributes
.IX "NFS server procedures" NFSPROC_STATFS() "" \fINFSPROC_STATFS()\fP
.DS
union statfsres switch (stat status) {
    case NFS_OK:
        struct {
            unsigned tsize;
            unsigned bsize;
            unsigned blocks;
            unsigned bfree;
            unsigned bavail;
        } info;
    default:
        void;
};

statfsres
NFSPROC_STATFS(fhandle) = 17;
.DE
.KE
If the reply "status" is
.I NFS_OK ,
then the reply "info" gives the
attributes for the filesystem that contains the file referred to by the
input fhandle. The attribute fields contain the following values:
.IP tsize:
The optimum transfer size of the server in bytes. This is
the number of bytes the server would like to have in the
data part of READ and WRITE requests.
.IP bsize:
The block size in bytes of the filesystem.
.IP blocks:
The total number of "bsize" blocks on the filesystem.
.IP bfree:
The number of free "bsize" blocks on the filesystem.
.IP bavail:
The number of "bsize" blocks available to non-privileged users.
.LP
Note: This call does not work well if a filesystem has variable
size blocks.
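.LP
As a sketch of how a client might interpret a STATFS reply, the code
below decodes the XDR-encoded statfsres and converts block counts to
bytes. It assumes the usual XDR layout (each unsigned is four bytes,
big-endian, with the union discriminant first); the function names
and the dictionary representation are illustrative choices.

```python
import struct

NFS_OK = 0


def decode_statfsres(data):
    """Return (status, info); info is None unless status is NFS_OK."""
    (status,) = struct.unpack(">I", data[:4])
    if status != NFS_OK:
        return status, None
    tsize, bsize, blocks, bfree, bavail = struct.unpack(">5I", data[4:24])
    info = {
        "tsize": tsize,
        "bsize": bsize,
        "blocks": blocks,
        "bfree": bfree,
        "bavail": bavail,
    }
    return status, info


def bytes_available(info):
    """Bytes a non-privileged user may still allocate: bavail * bsize."""
    return info["bavail"] * info["bsize"]
```

The multiplication is only meaningful when every block has the same
size, which is the limitation the note above points out.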
.NH 1
\&NFS Implementation Issues
.IX NFS implementation
.LP
The NFS protocol is designed to be operating system independent, but
since this version was designed in a UNIX environment, many
operations have semantics similar to the operations of the UNIX file
system. This section discusses some of the implementation-specific
semantic issues.
.NH 2
\&Server/Client Relationship
.IX NFS "server/client relationship"
.LP
The NFS protocol is designed to allow servers to be as simple and
general as possible. Sometimes the simplicity of the server can be a
problem, if the client wants to implement complicated filesystem
semantics.
.LP
For example, some operating systems allow removal of open files. A
process can open a file and, while it is open, remove it from the
directory. The file can be read and written as long as the process
keeps it open, even though the file has no name in the filesystem.
It is impossible for a stateless server to implement these semantics.
The client can do some tricks such as renaming the file on remove,
and only removing it on close. We believe that the server provides
enough functionality to implement most file system semantics on the
client.
.LP
Every NFS client can also potentially be a server, and remote and
local mounted filesystems can be freely intermixed. This leads to
some interesting problems when a client travels down the directory
tree of a remote filesystem and reaches the mount point on the server
for another remote filesystem. Allowing the server to follow the
second remote mount would require loop detection, server lookup, and
user revalidation. Instead, we decided not to let clients cross a
server's mount point. When a client does a LOOKUP on a directory on
which the server has mounted a filesystem, the client sees the
underlying directory instead of the mounted directory.
A client can
do remote mounts that match the server's mount points to maintain the
server's view.
.LP
.NH 2
\&Pathname Interpretation
.IX NFS "pathname interpretation"
.LP
There are a few complications to the rule that pathnames are always
parsed on the client. For example, symbolic links could have
different interpretations on different clients. Another common
problem for non-UNIX implementations is the special interpretation of
the pathname ".." to mean the parent of a given directory. The next
revision of the protocol uses an explicit flag to indicate the parent
instead.
.NH 2
\&Permission Issues
.IX NFS "permission issues"
.LP
The NFS protocol, strictly speaking, does not define the permission
checking used by servers. However, it is expected that a server
will do normal operating system permission checking using
.I AUTH_UNIX
style authentication as the basis of its protection mechanism. The
server gets the client's effective "uid", effective "gid", and groups
on each call and uses them to check permission. There are various
problems with this method that can be resolved in interesting ways.
.LP
Using "uid" and "gid" implies that the client and server share the
same "uid" list. Every server and client pair must have the same
mapping from user to "uid" and from group to "gid". Since every
client can also be a server, this tends to imply that the whole
network shares the same "uid/gid" space.
.I AUTH_DES
(and the next
revision of the NFS protocol) uses string names instead of numbers,
but there are still complex problems to be solved.
.LP
Another problem arises due to the usually stateful open operation.
Most operating systems check permission at open time, and then check
that the file is open on each read and write request.
With stateless
servers, the server has no idea that the file is open and must do
permission checking on each read and write call. On a local
filesystem, a user can open a file and then change the permissions so
that no one is allowed to touch it, but will still be able to write
to the file because it is open. On a remote filesystem, by contrast,
the write would fail. To get around this problem, the server's
permission checking algorithm should allow the owner of a file to
access it regardless of the permission setting.
.LP
A similar problem has to do with paging in from a file over the
network. The operating system usually checks for execute permission
before opening a file for demand paging, and then reads blocks from
the open file. The file may not have read permission, but after it
is opened it doesn't matter. An NFS server cannot tell the
difference between a normal file read and a demand page-in read. To
make this work, the server allows reading of files if the "uid" given
in the call has execute or read permission on the file.
.LP
In most operating systems, a particular user (with user ID zero)
has access to all files no matter what permission and ownership they
have. This "super-user" permission may not be allowed on the server,
since anyone who can become super-user on their workstation could
gain access to all remote files. The UNIX server by default maps
user id 0 to -2 before doing its access checking. This works except
for NFS root filesystems, where super-user access cannot be avoided.
.NH 2
\&Setting RPC Parameters
.IX NFS "setting RPC parameters"
.LP
Various file system parameters and options should be set at mount
time. The mount protocol is described in the appendix below. For
example, both "Soft" and "Hard" mounts are usually provided.
Soft mounted file systems return errors when RPC
operations fail (after a given number of optional retransmissions),
while hard mounted file systems continue to retransmit forever.
Clients and servers may need to keep caches of recent operations to
help avoid problems with non-idempotent operations.
.NH 1
\&Mount Protocol Definition
.IX "mount protocol" "" "" "" PAGE MAJOR
.sp 1
.NH 2
\&Introduction
.IX "mount protocol" introduction
.LP
The mount protocol is separate from, but related to, the NFS
protocol. It provides operating system specific services to get the
NFS off the ground -- looking up server path names, validating user
identity, and checking access permissions. Clients use the mount
protocol to get the first file handle, which allows them entry into a
remote filesystem.
.LP
The mount protocol is kept separate from the NFS protocol to make it
easy to plug in new access checking and validation methods without
changing the NFS server protocol.
.LP
Notice that the protocol definition implies stateful servers because
the server maintains a list of clients' mount requests. The mount
list information is not critical for the correct functioning of
either the client or the server. It is intended for advisory use
only, for example, to warn possible clients when a server is going
down.
.LP
Version one of the mount protocol is used with version two of the NFS
protocol. The only connecting point is the
.I fhandle
structure, which is the same for both protocols.
.NH 2
\&RPC Information
.IX "mount protocol" "RPC information"
.IP \fIAuthentication\fP
The mount service uses
.I AUTH_UNIX
and
.I AUTH_DES
style authentication only.
.IP "\fITransport Protocols\fP"
The mount service is currently supported on UDP/IP only.
.IP "\fIPort Number\fP"
Consult the server's portmapper, described in the chapter
.I "Remote Procedure Calls: Protocol Specification" ,
to find the port number on which the mount service is registered.
.NH 2
\&Sizes of XDR Structures
.IX "mount protocol" "XDR structure sizes"
.LP
These are the sizes, given in decimal bytes, of various XDR
structures used in the protocol:
.DS
/* \fIThe maximum number of bytes in a pathname argument\fP */
const MNTPATHLEN = 1024;

/* \fIThe maximum number of bytes in a name argument\fP */
const MNTNAMLEN = 255;

/* \fIThe size in bytes of the opaque file handle\fP */
const FHSIZE = 32;
.DE
.NH 2
\&Basic Data Types
.IX "mount protocol" "basic data types"
.IX "mount data types"
.LP
This section presents the data types used by the mount protocol.
In many cases they are similar to the types used in NFS.
.KS
.NH 3
\&fhandle
.IX "mount data types" fhandle "" \fIfhandle\fP
.DS
typedef opaque fhandle[FHSIZE];
.DE
.KE
The type
.I fhandle
is the file handle that the server passes to the
client. All file operations are done using file handles to refer
to a file or directory. The file handle can contain whatever
information the server needs to distinguish an individual file.
.LP
This is the same as the "fhandle" XDR definition in version 2 of
the NFS protocol; see
.I "Basic Data Types"
in the definition of the NFS protocol, above.
.KS
.NH 3
\&fhstatus
.IX "mount data types" fhstatus "" \fIfhstatus\fP
.DS
union fhstatus switch (unsigned status) {
    case 0:
        fhandle directory;
    default:
        void;
};
.DE
.KE
The type
.I fhstatus
is a union. If a "status" of zero is returned,
the call completed successfully, and a file handle for the
"directory" follows.
A non-zero status indicates some sort of
error. In this case the status is a UNIX error number.
.KS
.NH 3
\&dirpath
.IX "mount data types" dirpath "" \fIdirpath\fP
.DS
typedef string dirpath<MNTPATHLEN>;
.DE
.KE
The type
.I dirpath
is a server pathname of a directory.
.KS
.NH 3
\&name
.IX "mount data types" name "" \fIname\fP
.DS
typedef string name<MNTNAMLEN>;
.DE
.KE
The type
.I name
is an arbitrary string used for various names.
.NH 2
\&Server Procedures
.IX "mount server procedures"
.LP
The following sections define the RPC procedures supplied by a
mount server.
.ie t .DS
.el .DS L
.ft I
/*
 * Protocol description for the mount program
 */
.ft CW

program MOUNTPROG {
.ft I
/*
 * Version 1 of the mount protocol used with
 * version 2 of the NFS protocol.
 */
.ft CW
    version MOUNTVERS {
        void        MOUNTPROC_NULL(void)    = 0;
        fhstatus    MOUNTPROC_MNT(dirpath)  = 1;
        mountlist   MOUNTPROC_DUMP(void)    = 2;
        void        MOUNTPROC_UMNT(dirpath) = 3;
        void        MOUNTPROC_UMNTALL(void) = 4;
        exportlist  MOUNTPROC_EXPORT(void)  = 5;
    } = 1;
} = 100005;
.DE
.KS
.NH 3
\&Do Nothing
.IX "mount server procedures" MNTPROC_NULL() "" \fIMNTPROC_NULL()\fP
.DS
void
MNTPROC_NULL(void) = 0;
.DE
.KE
This procedure does no work. It is made available in all RPC
services to allow server response testing and timing.
.KS
.NH 3
\&Add Mount Entry
.IX "mount server procedures" MNTPROC_MNT() "" \fIMNTPROC_MNT()\fP
.DS
fhstatus
MNTPROC_MNT(dirpath) = 1;
.DE
.KE
If the reply "status" is 0, then the reply "directory" contains the
file handle for the directory named by the input "dirpath". This file
handle may be used in the NFS protocol. This procedure also adds a new
entry to the mount list for this client mounting that directory.
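.LP
The following Python sketch shows one way a client might decode the
.I fhstatus
result of the mount procedure. The function name decode_fhstatus is
hypothetical. The sketch assumes standard XDR encoding: the discriminant is
a four-byte big-endian unsigned integer, and the fixed-length opaque
.I fhandle
follows it with no length prefix.

```python
import struct

FHSIZE = 32  # size in bytes of the opaque file handle


def decode_fhstatus(reply: bytes):
    """Return (status, fhandle); fhandle is None when status is non-zero."""
    (status,) = struct.unpack_from(">I", reply, 0)
    if status != 0:
        # The default arm of the union is void: nothing follows the status.
        return status, None
    fhandle = reply[4:4 + FHSIZE]
    if len(fhandle) != FHSIZE:
        raise ValueError("truncated file handle")
    return 0, fhandle
```

The client treats the returned handle as opaque and passes it unchanged to
the NFS procedures.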
.KS
.NH 3
\&Return Mount Entries
.IX "mount server procedures" MNTPROC_DUMP() "" \fIMNTPROC_DUMP()\fP
.DS
struct *mountlist {
    name      hostname;
    dirpath   directory;
    mountlist nextentry;
};

mountlist
MNTPROC_DUMP(void) = 2;
.DE
.KE
Returns the list of remotely mounted filesystems. The "mountlist"
contains one entry for each "hostname" and "directory" pair.
.KS
.NH 3
\&Remove Mount Entry
.IX "mount server procedures" MNTPROC_UMNT() "" \fIMNTPROC_UMNT()\fP
.DS
void
MNTPROC_UMNT(dirpath) = 3;
.DE
.KE
Removes the mount list entry for the input "dirpath".
.KS
.NH 3
\&Remove All Mount Entries
.IX "mount server procedures" MNTPROC_UMNTALL() "" \fIMNTPROC_UMNTALL()\fP
.DS
void
MNTPROC_UMNTALL(void) = 4;
.DE
.KE
Removes all of the mount list entries for this client.
.KS
.NH 3
\&Return Export List
.IX "mount server procedures" MNTPROC_EXPORT() "" \fIMNTPROC_EXPORT()\fP
.DS
struct *groups {
    name   grname;
    groups grnext;
};

struct *exportlist {
    dirpath    filesys;
    groups     groups;
    exportlist next;
};

exportlist
MNTPROC_EXPORT(void) = 5;
.DE
.KE
Returns a variable number of export list entries. Each entry
contains a filesystem name and a list of groups that are allowed to
import it. The filesystem name is in "filesys", and the group names
are in the list "groups".
.LP
Note: The exportlist should contain
more information about the status of the filesystem, such as a
read-only flag.
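.LP
The linked-list results above can be decoded iteratively. The Python sketch
below reads the "struct *" notation as XDR optional data, so that each
pointer is a four-byte boolean followed, when non-zero, by the structure it
points to; that reading is an assumption of this sketch. Strings are taken
to be a four-byte length, the bytes, and zero padding to a multiple of four.
All function names are hypothetical.

```python
import struct


def xdr_string(text: str) -> bytes:
    """Encode a string as XDR: length, bytes, zero padding to 4 bytes."""
    data = text.encode("ascii")
    return struct.pack(">I", len(data)) + data + b"\0" * (-len(data) % 4)


def _read_u32(buf, pos):
    (value,) = struct.unpack_from(">I", buf, pos)
    return value, pos + 4


def _read_string(buf, pos):
    length, pos = _read_u32(buf, pos)
    text = buf[pos:pos + length].decode("ascii")
    return text, pos + length + (-length % 4)


def decode_exportlist(buf: bytes):
    """Return a list of (filesys, [group names]) tuples."""
    pos = 0
    exports = []
    while True:
        present, pos = _read_u32(buf, pos)   # is there another exportlist node?
        if not present:
            return exports
        filesys, pos = _read_string(buf, pos)
        groups = []
        while True:
            present, pos = _read_u32(buf, pos)  # is there another groups node?
            if not present:
                break
            grname, pos = _read_string(buf, pos)
            groups.append(grname)
        exports.append((filesys, groups))
```

The same loop structure applies to "mountlist", with "hostname" and
"directory" read in place of "filesys" and the nested group list.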
.LP
On the other hand, NFS deals with objects such as files and
directories that inherently have state -- what good would a file be
if it did not keep its contents intact? The goal is to not
introduce any extra state in the protocol itself. Another way to
simplify recovery is by making operations "idempotent" whenever
possible (so that they can potentially be repeated).
.NH 1
\&NFS Protocol Definition
.IX NFS "protocol definition"
.IX NFS protocol
.LP
Servers have been known to change over time, and so can the
protocol that they use. So RPC provides a version number with each
RPC request. This RFC describes version two of the NFS protocol.
Even in the second version, there are various obsolete procedures
and parameters, which will be removed in later versions. An RFC
for version three of the NFS protocol is currently under
preparation.
.NH 2
\&File System Model
.IX filesystem model
.LP
NFS assumes a file system that is hierarchical, with directories as
all but the bottom-level files. Each entry in a directory (file,
directory, device, etc.) has a string name. Different operating
systems may have restrictions on the depth of the tree or the names
used, as well as using different syntax to represent the "pathname",
which is the concatenation of all the "components" (directory and
file names) in the name. A "file system" is a tree on a single
server (usually a single disk or physical partition) with a specified
"root". Some operating systems provide a "mount" operation to make
all file systems appear as a single tree, while others maintain a
"forest" of file systems. Files are unstructured streams of
uninterpreted bytes. Version 3 of NFS uses a slightly more general
file system model.
.LP
NFS looks up one component of a pathname at a time.
It may not be
obvious why it does not just take the whole pathname, traipse down
the directories, and return a file handle when it is done. There are
several good reasons not to do this. First, pathnames need
separators between the directory components, and different operating
systems use different separators. We could define a Network Standard
Pathname Representation, but then every pathname would have to be
parsed and converted at each end. Other issues are discussed in
\fINFS Implementation Issues\fP below.
.LP
Although files and directories are similar objects in many ways,
different procedures are used to read directories and files. This
provides a network standard format for representing directories. The
same argument as above could have been used to justify a procedure
that returns only one directory entry per call. The problem is
efficiency. Directories can contain many entries, and a remote call
to return each would be just too slow.
.NH 2
\&RPC Information
.IX NFS "RPC information"
.IP \fIAuthentication\fP
The NFS service uses
.I AUTH_UNIX ,
.I AUTH_DES ,
or
.I AUTH_SHORT
style
authentication, except in the NULL procedure where
.I AUTH_NONE
is also allowed.
.IP "\fITransport Protocols\fP"
NFS currently is supported on UDP/IP only.
.IP "\fIPort Number\fP"
The NFS protocol currently uses the UDP port number 2049. This is
not an officially assigned port, so later versions of the protocol
use the \*QPortmapping\*U facility of RPC.
.NH 2
\&Sizes of XDR Structures
.IX "XDR structure sizes"
.LP
These are the sizes, given in decimal bytes, of various XDR
structures used in the protocol:
.DS
/* \fIThe maximum number of bytes of data in a READ or WRITE request\fP */
const MAXDATA = 8192;

/* \fIThe maximum number of bytes in a pathname argument\fP */
const MAXPATHLEN = 1024;

/* \fIThe maximum number of bytes in a file name argument\fP */
const MAXNAMLEN = 255;

/* \fIThe size in bytes of the opaque "cookie" passed by READDIR\fP */
const COOKIESIZE = 4;

/* \fIThe size in bytes of the opaque file handle\fP */
const FHSIZE = 32;
.DE
.NH 2
\&Basic Data Types
.IX "NFS data types"
.IX NFS "basic data types"
.LP
The following XDR definitions are basic structures and types used
in other structures described further on.
.KS
.NH 3
\&stat
.IX "NFS data types" stat "" \fIstat\fP
.DS
enum stat {
    NFS_OK = 0,
    NFSERR_PERM = 1,
    NFSERR_NOENT = 2,
    NFSERR_IO = 5,
    NFSERR_NXIO = 6,
    NFSERR_ACCES = 13,
    NFSERR_EXIST = 17,
    NFSERR_NODEV = 19,
    NFSERR_NOTDIR = 20,
    NFSERR_ISDIR = 21,
    NFSERR_FBIG = 27,
    NFSERR_NOSPC = 28,
    NFSERR_ROFS = 30,
    NFSERR_NAMETOOLONG = 63,
    NFSERR_NOTEMPTY = 66,
    NFSERR_DQUOT = 69,
    NFSERR_STALE = 70,
    NFSERR_WFLUSH = 99
};
.DE
.KE
.LP
The
.I stat
type is returned with every procedure's results. A
value of
.I NFS_OK
indicates that the call completed successfully and
the results are valid. The other values indicate some kind of
error occurred on the server side during the servicing of the
procedure. The error values are derived from UNIX error numbers.
.IP \fBNFSERR_PERM\fP:
Not owner. The caller does not have correct ownership
to perform the requested operation.
.IP \fBNFSERR_NOENT\fP:
No such file or directory. The file or directory
specified does not exist.
.IP \fBNFSERR_IO\fP:
Some sort of hard error occurred when the operation was
in progress. This could be a disk error, for example.
.IP \fBNFSERR_NXIO\fP:
No such device or address.
.IP \fBNFSERR_ACCES\fP:
Permission denied. The caller does not have the
correct permission to perform the requested operation.
.IP \fBNFSERR_EXIST\fP:
File exists. The file specified already exists.
.IP \fBNFSERR_NODEV\fP:
No such device.
.IP \fBNFSERR_NOTDIR\fP:
Not a directory. The caller specified a
non-directory in a directory operation.
.IP \fBNFSERR_ISDIR\fP:
Is a directory. The caller specified a directory in
a non-directory operation.
.IP \fBNFSERR_FBIG\fP:
File too large. The operation caused a file to grow
beyond the server's limit.
.IP \fBNFSERR_NOSPC\fP:
No space left on device. The operation caused the
server's filesystem to reach its limit.
.IP \fBNFSERR_ROFS\fP:
Read-only filesystem. Write attempted on a read-only filesystem.
.IP \fBNFSERR_NAMETOOLONG\fP:
File name too long. The file name in an operation was too long.
.IP \fBNFSERR_NOTEMPTY\fP:
Directory not empty. Attempted to remove a
directory that was not empty.
.IP \fBNFSERR_DQUOT\fP:
Disk quota exceeded. The client's disk quota on the
server has been exceeded.
.IP \fBNFSERR_STALE\fP:
The "fhandle" given in the arguments was invalid.
That is, the file referred to by that file handle no longer exists,
or access to it has been revoked.
.IP \fBNFSERR_WFLUSH\fP:
The server's write cache used in the
.I WRITECACHE
call got flushed to disk.
.LP
.KS
.NH 3
\&ftype
.IX "NFS data types" ftype "" \fIftype\fP
.DS
enum ftype {
    NFNON = 0,
    NFREG = 1,
    NFDIR = 2,
    NFBLK = 3,
    NFCHR = 4,
    NFLNK = 5
};
.DE
.KE
The enumeration
.I ftype
gives the type of a file.
The type
.I NFNON
indicates a non-file,
.I NFREG
is a regular file,
.I NFDIR
is a directory,
.I NFBLK
is a block-special device,
.I NFCHR
is a character-special device, and
.I NFLNK
is a symbolic link.
.KS
.NH 3
\&fhandle
.IX "NFS data types" fhandle "" \fIfhandle\fP
.DS
typedef opaque fhandle[FHSIZE];
.DE
.KE
The
.I fhandle
is the file handle passed between the server and the client.
All file operations are done using file handles to refer to a file or
directory. The file handle can contain whatever information the server
needs to distinguish an individual file.
.KS
.NH 3
\&timeval
.IX "NFS data types" timeval "" \fItimeval\fP
.DS
struct timeval {
    unsigned int seconds;
    unsigned int useconds;
};
.DE
.KE
The
.I timeval
structure is the number of seconds and microseconds
since midnight January 1, 1970, Greenwich Mean Time. It is used to
pass time and date information.
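.LP
As a small worked example, the Python sketch below decodes a
.I timeval
and converts it to a calendar time. The helper names are hypothetical, and
the sketch assumes each field is a four-byte big-endian XDR unsigned integer
and treats Greenwich Mean Time as UTC.

```python
import struct
from datetime import datetime, timedelta, timezone

# Midnight January 1, 1970, the origin named in the timeval description.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_timeval(buf: bytes, offset: int = 0):
    """Return (seconds, useconds) read from two XDR unsigned integers."""
    return struct.unpack_from(">II", buf, offset)


def timeval_to_datetime(seconds: int, useconds: int) -> datetime:
    """Convert a timeval to a timezone-aware datetime."""
    return EPOCH + timedelta(seconds=seconds, microseconds=useconds)
```

Because both fields are unsigned 32-bit values, "seconds" can represent
times up to 2**32 - 1 seconds after the origin.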
.KS
.NH 3
\&fattr
.IX "NFS data types" fattr "" \fIfattr\fP
.DS
struct fattr {
    ftype        type;
    unsigned int mode;
    unsigned int nlink;
    unsigned int uid;
    unsigned int gid;
    unsigned int size;
    unsigned int blocksize;
    unsigned int rdev;
    unsigned int blocks;
    unsigned int fsid;
    unsigned int fileid;
    timeval      atime;
    timeval      mtime;
    timeval      ctime;
};
.DE
.KE
The
.I fattr
structure contains the attributes of a file; "type" is the type of
the file; "nlink" is the number of hard links to the file (the number
of different names for the same file); "uid" is the user
identification number of the owner of the file; "gid" is the group
identification number of the group of the file; "size" is the size in
bytes of the file; "blocksize" is the size in bytes of a block of the
file; "rdev" is the device number of the file if it is type
.I NFCHR
or
.I NFBLK ;
"blocks" is the number of blocks the file takes up on disk; "fsid" is
the file system identifier for the filesystem containing the file;
"fileid" is a number that uniquely identifies the file within its
filesystem; "atime" is the time when the file was last accessed for
either read or write; "mtime" is the time when the file data was last
modified (written); and "ctime" is the time when the status of the
file was last changed. Writing to the file also changes "ctime" if
the size of the file changes.
.LP
"mode" is the access mode encoded as a set of bits. Notice that the
file type is specified both in the mode bits and in the file type.
This is really a bug in the protocol and will be fixed in future
versions. The descriptions given below specify the bit positions
using octal numbers.
.TS
box tab (&) ;
cfI cfI
lfL l .
Bit&Description
_
0040000&This is a directory; "type" field should be NFDIR.
0020000&This is a character special file; "type" field should be NFCHR.
0060000&This is a block special file; "type" field should be NFBLK.
0100000&This is a regular file; "type" field should be NFREG.
0120000&This is a symbolic link file; "type" field should be NFLNK.
0140000&This is a named socket; "type" field should be NFNON.
0004000&Set user id on execution.
0002000&Set group id on execution.
0001000&Save swapped text even after use.
0000400&Read permission for owner.
0000200&Write permission for owner.
0000100&Execute and search permission for owner.
0000040&Read permission for group.
0000020&Write permission for group.
0000010&Execute and search permission for group.
0000004&Read permission for others.
0000002&Write permission for others.
0000001&Execute and search permission for others.
.TE
.KS
Notes:
.IP
The bits are the same as the mode bits returned by the
.I stat(2)
system call in the UNIX system. The file type is specified both in
the mode bits and in the file type. This is fixed in future
versions.
.IP
The "rdev" field in the attributes structure is an operating system
specific device specifier. It will be removed and generalized in
the next revision of the protocol.
.KE
.LP
.KS
.NH 3
\&sattr
.IX "NFS data types" sattr "" \fIsattr\fP
.DS
struct sattr {
    unsigned int mode;
    unsigned int uid;
    unsigned int gid;
    unsigned int size;
    timeval      atime;
    timeval      mtime;
};
.DE
.KE
The
.I sattr
structure contains the file attributes which can be set
from the client. The fields are the same as for
.I fattr
above. A "size" of zero means the file should be truncated.
A value of -1 indicates a field that should be ignored.
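.LP
The mode-bit table given with
.I fattr
above can be applied as in the following Python sketch, which recovers the
file type from "mode", checks it against the separate "type" field, and
renders the nine permission bits. The helper names are hypothetical, and the
mask 0170000 used to isolate the file-type bits is an assumption borrowed
from UNIX; the table itself lists only the individual type values.

```python
# ftype values from the enumeration defined earlier.
NFNON, NFREG, NFDIR, NFBLK, NFCHR, NFLNK = 0, 1, 2, 3, 4, 5

TYPE_MASK = 0o170000  # assumed mask covering the file-type bits

MODE_TO_FTYPE = {
    0o040000: NFDIR,
    0o020000: NFCHR,
    0o060000: NFBLK,
    0o100000: NFREG,
    0o120000: NFLNK,
    0o140000: NFNON,  # named socket
}


def ftype_from_mode(mode: int):
    """Return the ftype implied by the mode bits, or None if unknown."""
    return MODE_TO_FTYPE.get(mode & TYPE_MASK)


def type_matches(ftype: int, mode: int) -> bool:
    """Check that the redundant "type" and "mode" fields agree."""
    return ftype_from_mode(mode) == ftype


def permission_string(mode: int) -> str:
    """Render owner, group and other permissions as nine characters."""
    out = []
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0o7
        out.append("r" if bits & 4 else "-")
        out.append("w" if bits & 2 else "-")
        out.append("x" if bits & 1 else "-")
    return "".join(out)
```

A client that receives inconsistent "type" and "mode" values has no rule in
this specification for choosing between them, which is one reason the
duplication is described as a bug.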
.LP
.KS
.NH 3
\&filename
.IX "NFS data types" filename "" \fIfilename\fP
.DS
typedef string filename<MAXNAMLEN>;
.DE
.KE
The type
.I filename
is used for passing file names or pathname components.
.LP
.KS
.NH 3
\&path
.IX "NFS data types" path "" \fIpath\fP
.DS
typedef string path<MAXPATHLEN>;
.DE
.KE
The type
.I path
is a pathname. The server considers it as a string
with no internal structure, but to the client it is the name of a
node in a filesystem tree.
.LP
.KS
.NH 3
\&attrstat
.IX "NFS data types" attrstat "" \fIattrstat\fP
.DS
union attrstat switch (stat status) {
    case NFS_OK:
        fattr attributes;
    default:
        void;
};
.DE
.KE
The
.I attrstat
structure is a common procedure result. It contains
a "status" and, if the call succeeded, it also contains the
attributes of the file on which the operation was done.
.LP
.KS
.NH 3
\&diropargs
.IX "NFS data types" diropargs "" \fIdiropargs\fP
.DS
struct diropargs {
    fhandle  dir;
    filename name;
};
.DE
.KE
The
.I diropargs
structure is used in directory operations. The
"fhandle" "dir" is the directory in which to find the file "name".
A directory operation is one in which the directory is affected.
.LP
.KS
.NH 3
\&diropres
.IX "NFS data types" diropres "" \fIdiropres\fP
.DS
union diropres switch (stat status) {
    case NFS_OK:
        struct {
            fhandle file;
            fattr   attributes;
        } diropok;
    default:
        void;
};
.DE
.KE
The results of a directory operation are returned in a
.I diropres
structure. If the call succeeded, a new file handle "file" and the
"attributes" associated with that file are returned along with the
"status".
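.LP
To make the wire layout of
.I diropargs
concrete, the Python sketch below encodes one. It assumes standard XDR
rules: the fixed-length opaque "dir" is written as 32 raw bytes, and the
variable-length "name" is written as a four-byte big-endian length, the
bytes themselves, and zero padding to a multiple of four. The function name
encode_diropargs is hypothetical.

```python
import struct

FHSIZE = 32      # size in bytes of the opaque file handle
MAXNAMLEN = 255  # maximum number of bytes in a file name argument


def encode_diropargs(dir_fh: bytes, name: bytes) -> bytes:
    """Encode a diropargs structure as XDR bytes."""
    if len(dir_fh) != FHSIZE:
        raise ValueError("file handle must be exactly FHSIZE bytes")
    if len(name) > MAXNAMLEN:
        raise ValueError("file name longer than MAXNAMLEN")
    padding = b"\0" * (-len(name) % 4)
    return dir_fh + struct.pack(">I", len(name)) + name + padding
```

For the six-byte name "passwd" the encoding is 44 bytes long: 32 for the
handle, 4 for the length, 6 for the name, and 2 of padding.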
.NH 2
\&Server Procedures
.IX "NFS server procedures" "" "" "" PAGE MAJOR
.LP
The protocol definition is given as a set of procedures with
arguments and results defined using the RPC language. A brief
description of the function of each procedure should provide enough
information to allow implementation.
.LP
All of the procedures in the NFS protocol are assumed to be
synchronous. When a procedure returns to the client, the client
can assume that the operation has completed and any data associated
with the request is now on stable storage. For example, a client
.I WRITE
request may cause the server to update data blocks,
filesystem information blocks (such as indirect blocks), and file
attribute information (size and modify times). When the
.I WRITE
returns to the client, it can assume that the write is safe, even
in case of a server crash, and it can discard the data written.
This is a very important part of the statelessness of the server.
If the server waited to flush data from remote requests, the client
would have to save those requests so that it could resend them in
case of a server crash.
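.LP
The stable-storage requirement can be pictured with the following Python
sketch of a server-side write handler for a UNIX-like host. It is an
illustration only, not the behavior of any particular server: the function
name stable_write is hypothetical, and the sketch assumes that os.fsync() is
sufficient to place both data and attributes on stable storage before the
reply is sent.

```python
import os


def stable_write(fd: int, offset: int, data: bytes) -> int:
    """Write data at offset, flush it, and return the resulting file size."""
    view = memoryview(data)
    written = 0
    while written < len(view):
        # os.pwrite may write fewer bytes than requested, so loop.
        written += os.pwrite(fd, view[written:], offset + written)
    # Flush before replying, so the client may discard its copy of the data.
    os.fsync(fd)
    return os.fstat(fd).st_size
```

Only after the flush returns would such a server build the reply, which is
what lets a client forget the request once the reply arrives.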
.ie t .DS
.el .DS L

.ft I
/*
 * Remote file service routines
 */
.ft CW
program NFS_PROGRAM {
    version NFS_VERSION {
        void        NFSPROC_NULL(void)              = 0;
        attrstat    NFSPROC_GETATTR(fhandle)        = 1;
        attrstat    NFSPROC_SETATTR(sattrargs)      = 2;
        void        NFSPROC_ROOT(void)              = 3;
        diropres    NFSPROC_LOOKUP(diropargs)       = 4;
        readlinkres NFSPROC_READLINK(fhandle)       = 5;
        readres     NFSPROC_READ(readargs)          = 6;
        void        NFSPROC_WRITECACHE(void)        = 7;
        attrstat    NFSPROC_WRITE(writeargs)        = 8;
        diropres    NFSPROC_CREATE(createargs)      = 9;
        stat        NFSPROC_REMOVE(diropargs)       = 10;
        stat        NFSPROC_RENAME(renameargs)      = 11;
        stat        NFSPROC_LINK(linkargs)          = 12;
        stat        NFSPROC_SYMLINK(symlinkargs)    = 13;
        diropres    NFSPROC_MKDIR(createargs)       = 14;
        stat        NFSPROC_RMDIR(diropargs)        = 15;
        readdirres  NFSPROC_READDIR(readdirargs)    = 16;
        statfsres   NFSPROC_STATFS(fhandle)         = 17;
    } = 2;
} = 100003;
.DE
.KS
.NH 3
\&Do Nothing
.IX "NFS server procedures" NFSPROC_NULL() "" \fINFSPROC_NULL()\fP
.DS
void
NFSPROC_NULL(void) = 0;
.DE
.KE
This procedure does no work. It is made available in all RPC
services to allow server response testing and timing.
.KS
.NH 3
\&Get File Attributes
.IX "NFS server procedures" NFSPROC_GETATTR() "" \fINFSPROC_GETATTR()\fP
.DS
attrstat
NFSPROC_GETATTR(fhandle) = 1;
.DE
.KE
If the reply status is
.I NFS_OK ,
then the reply attributes contains
the attributes for the file given by the input fhandle.
.KS
.NH 3
\&Set File Attributes
.IX "NFS server procedures" NFSPROC_SETATTR() "" \fINFSPROC_SETATTR()\fP
.DS
struct sattrargs {
    fhandle file;
    sattr   attributes;
};

attrstat
NFSPROC_SETATTR(sattrargs) = 2;
.DE
.KE
The "attributes" argument contains fields which are either -1 or
are the new value for the attributes of "file".
If the reply
status is
.I NFS_OK ,
then the reply attributes contain the attributes of
the file after the "SETATTR" operation has completed.
.LP
Note: The use of -1 to indicate an unused field in "attributes" is
changed in the next version of the protocol.
.KS
.NH 3
\&Get Filesystem Root
.IX "NFS server procedures" NFSPROC_ROOT "" \fINFSPROC_ROOT\fP
.DS
void
NFSPROC_ROOT(void) = 3;
.DE
.KE
Obsolete. This procedure is no longer used because finding the
root file handle of a filesystem requires moving pathnames between
client and server. To do this right we would have to define a
network standard representation of pathnames. Instead, the
function of looking up the root file handle is done by the
.I MNTPROC_MNT()
procedure. (See the
.I "Mount Protocol Definition"
later in this chapter for details.)
.KS
.NH 3
\&Look Up File Name
.IX "NFS server procedures" NFSPROC_LOOKUP() "" \fINFSPROC_LOOKUP()\fP
.DS
diropres
NFSPROC_LOOKUP(diropargs) = 4;
.DE
.KE
If the reply "status" is
.I NFS_OK ,
then the reply "file" and reply
"attributes" are the file handle and attributes for the file "name"
in the directory given by "dir" in the argument.
.KS
.NH 3
\&Read From Symbolic Link
.IX "NFS server procedures" NFSPROC_READLINK() "" \fINFSPROC_READLINK()\fP
.DS
union readlinkres switch (stat status) {
    case NFS_OK:
        path data;
    default:
        void;
};

readlinkres
NFSPROC_READLINK(fhandle) = 5;
.DE
.KE
If "status" has the value
.I NFS_OK ,
then the reply "data" is the data in
the symbolic link given by the file referred to by the fhandle argument.
.LP
Note: since NFS always parses pathnames on the client, the
pathname in a symbolic link may mean something different (or be
meaningless) on a different client or on the server if a different
pathname syntax is used.
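.LP
As an illustration of the -1 convention described for SETATTR above,
the following Python sketch encodes an "sattr" in which only the mode
is changed. It is not part of the protocol definition. It assumes the
"sattr" layout given in the Basic Data Types section of the NFS
definition (mode, uid, gid, size, atime, mtime, with each time being a
seconds/useconds pair of 32-bit unsigned integers); the helper name
pack_sattr is hypothetical.

```python
import struct

# -1 viewed as a 32-bit unsigned XDR integer: "leave this field alone".
UNUSED = 0xFFFFFFFF


def pack_sattr(mode=None, uid=None, gid=None, size=None, atime=None, mtime=None):
    """Encode an sattr; arguments left as None are sent as -1 (unused).

    Assumed field order: mode, uid, gid, size, atime, mtime, where each
    time is a (seconds, useconds) pair.
    """
    def u32(value):
        return UNUSED if value is None else value

    def timeval(value):
        # Both halves of an unused timeval are sent as -1.
        return (UNUSED, UNUSED) if value is None else value

    fields = [u32(mode), u32(uid), u32(gid), u32(size),
              *timeval(atime), *timeval(mtime)]
    # Eight big-endian 32-bit unsigned integers, as XDR requires.
    return struct.pack(">8I", *fields)


# Change only the mode; every other field is marked unused.
chmod_only = pack_sattr(mode=0o644)
```

A server receiving this argument would be expected to update the mode
and leave the owner, group, size, and times untouched.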
.KS
.NH 3
\&Read From File
.IX "NFS server procedures" NFSPROC_READ "" \fINFSPROC_READ\fP
.DS
struct readargs {
    fhandle file;
    unsigned offset;
    unsigned count;
    unsigned totalcount;
};

union readres switch (stat status) {
    case NFS_OK:
        fattr attributes;
        opaque data<NFS_MAXDATA>;
    default:
        void;
};

readres
NFSPROC_READ(readargs) = 6;
.DE
.KE
Returns up to "count" bytes of "data" from the file given by
"file", starting at "offset" bytes from the beginning of the file.
The first byte of the file is at offset zero. The file attributes
after the read takes place are returned in "attributes".
.LP
Note: The argument "totalcount" is unused, and is removed in the
next protocol revision.
.KS
.NH 3
\&Write to Cache
.IX "NFS server procedures" NFSPROC_WRITECACHE() "" \fINFSPROC_WRITECACHE()\fP
.DS
void
NFSPROC_WRITECACHE(void) = 7;
.DE
.KE
To be used in the next protocol revision.
.KS
.NH 3
\&Write to File
.IX "NFS server procedures" NFSPROC_WRITE() "" \fINFSPROC_WRITE()\fP
.DS
struct writeargs {
    fhandle file;
    unsigned beginoffset;
    unsigned offset;
    unsigned totalcount;
    opaque data<NFS_MAXDATA>;
};

attrstat
NFSPROC_WRITE(writeargs) = 8;
.DE
.KE
Writes "data" beginning "offset" bytes from the beginning of
"file". The first byte of the file is at offset zero. If the
reply "status" is NFS_OK, then the reply "attributes" contains the
attributes of the file after the write has completed. The write
operation is atomic. Data from this call to
.I WRITE
will not be mixed with data from another client's calls.
.LP
Note: The arguments "beginoffset" and "totalcount" are ignored and
are removed in the next protocol revision.
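.LP
To make the wire layout of the READ and WRITE arguments concrete, here
is a small Python sketch of how "readargs" and "writeargs" might be
XDR-encoded. It is an illustration only, not a reference encoder. The
function names are hypothetical; the sketch assumes the 32-byte
"fhandle" size (FHSIZE) used throughout this chapter and the usual XDR
rules (big-endian 32-bit integers, variable-length opaque data preceded
by a length and padded to a multiple of four bytes). The unused fields
are sent as zero here, which is an arbitrary choice since they are
ignored.

```python
import struct

FHSIZE = 32  # size in bytes of the opaque file handle


def pack_opaque(data):
    """XDR variable-length opaque: length, bytes, zero padding to 4 bytes."""
    padding = (4 - len(data) % 4) % 4
    return struct.pack(">I", len(data)) + data + b"\x00" * padding


def check_fhandle(fhandle):
    if len(fhandle) != FHSIZE:
        raise ValueError("file handle must be exactly FHSIZE bytes")


def pack_readargs(fhandle, offset, count):
    """Encode readargs: file, offset, count, totalcount (unused)."""
    check_fhandle(fhandle)
    return fhandle + struct.pack(">3I", offset, count, 0)


def pack_writeargs(fhandle, offset, data):
    """Encode writeargs: file, beginoffset (ignored), offset,
    totalcount (ignored), data."""
    check_fhandle(fhandle)
    return fhandle + struct.pack(">3I", 0, offset, 0) + pack_opaque(data)
```

For example, pack_readargs(bytes(FHSIZE), 4096, 1024) yields a 44-byte
argument: the 32-byte handle followed by three 4-byte integers.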
.KS
.NH 3
\&Create File
.IX "NFS server procedures" NFSPROC_CREATE() "" \fINFSPROC_CREATE()\fP
.DS
struct createargs {
    diropargs where;
    sattr attributes;
};

diropres
NFSPROC_CREATE(createargs) = 9;
.DE
.KE
The file "where.name" is created in the directory given by "where.dir". The
initial attributes of the new file are given by "attributes". A
reply "status" of NFS_OK indicates that the file was created, and
reply "file" and reply "attributes" are its file handle and
attributes. Any other reply "status" means that the operation
failed and no file was created.
.LP
Note: This routine should pass an exclusive create flag, meaning
"create the file only if it is not already there".
.KS
.NH 3
\&Remove File
.IX "NFS server procedures" NFSPROC_REMOVE() "" \fINFSPROC_REMOVE()\fP
.DS
stat
NFSPROC_REMOVE(diropargs) = 10;
.DE
.KE
The file "name" is removed from the directory given by "dir". A
reply of NFS_OK means the directory entry was removed.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Rename File
.IX "NFS server procedures" NFSPROC_RENAME() "" \fINFSPROC_RENAME()\fP
.DS
struct renameargs {
    diropargs from;
    diropargs to;
};

stat
NFSPROC_RENAME(renameargs) = 11;
.DE
.KE
The existing file "from.name" in the directory given by "from.dir"
is renamed to "to.name" in the directory given by "to.dir". If the
reply is
.I NFS_OK ,
the file was renamed. The RENAME operation is
atomic on the server; it cannot be interrupted in the middle.
.LP
Note: possibly non-idempotent operation.
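.LP
The "possibly non-idempotent" notes above matter because a client
retransmits a request when a reply is lost. The following Python
sketch shows the effect on REMOVE and one possible mitigation, a
server-side cache of recent replies. The cache, the ToyServer class,
and the use of a (client, xid) pair as the cache key are assumptions
made for this illustration; the protocol itself does not define them.
The status values NFS_OK (0) and NFSERR_NOENT (2) are taken from the
"stat" type.

```python
NFS_OK = 0
NFSERR_NOENT = 2


class ToyServer:
    """In-memory stand-in for one directory on a server (hypothetical)."""

    def __init__(self, names):
        self.names = set(names)
        self.replies = {}  # assumed reply cache keyed by (client, xid)

    def remove(self, name):
        """Plain REMOVE: a retransmission finds nothing left to remove."""
        if name not in self.names:
            return NFSERR_NOENT
        self.names.remove(name)
        return NFS_OK

    def remove_cached(self, client, xid, name):
        """REMOVE with a reply cache: a retransmitted request (same
        client and transaction id) receives the original reply again."""
        key = (client, xid)
        if key not in self.replies:
            self.replies[key] = self.remove(name)
        return self.replies[key]


server = ToyServer(["a", "b"])
first = server.remove("a")                       # succeeds
retry = server.remove("a")                       # retransmission fails
cached_first = server.remove_cached("c1", 7, "b")
cached_retry = server.remove_cached("c1", 7, "b")
```

Without the cache the retransmitted request reports an error even
though the original request succeeded; with it, both replies agree.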
.KS
.NH 3
\&Create Link to File
.IX "NFS server procedures" NFSPROC_LINK() "" \fINFSPROC_LINK()\fP
.DS
struct linkargs {
    fhandle from;
    diropargs to;
};

stat
NFSPROC_LINK(linkargs) = 12;
.DE
.KE
Creates the file "to.name" in the directory given by "to.dir",
which is a hard link to the existing file given by "from". If the
return value is
.I NFS_OK ,
a link was created. Any other return value
indicates an error, and the link was not created.
.LP
A hard link should have the property that changes to either of the
linked files are reflected in both files. When a hard link is made
to a file, the attributes for the file should have a value for
"nlink" that is one greater than the value before the link.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Create Symbolic Link
.IX "NFS server procedures" NFSPROC_SYMLINK() "" \fINFSPROC_SYMLINK()\fP
.DS
struct symlinkargs {
    diropargs from;
    path to;
    sattr attributes;
};

stat
NFSPROC_SYMLINK(symlinkargs) = 13;
.DE
.KE
Creates the file "from.name" with ftype
.I NFLNK
in the directory
given by "from.dir". The new file contains the pathname "to" and
has initial attributes given by "attributes". If the return value
is
.I NFS_OK ,
a link was created. Any other return value indicates an
error, and the link was not created.
.LP
A symbolic link is a pointer to another file. The name given in
"to" is not interpreted by the server, only stored in the newly
created file. When the client references a file that is a symbolic
link, the contents of the symbolic link are normally transparently
reinterpreted as a pathname to substitute. A
.I READLINK
operation returns the data to the client for interpretation.
.LP
Note: On UNIX servers the attributes are never used, since
symbolic links always have mode 0777.
.KS
.NH 3
\&Create Directory
.IX "NFS server procedures" NFSPROC_MKDIR() "" \fINFSPROC_MKDIR()\fP
.DS
diropres
NFSPROC_MKDIR (createargs) = 14;
.DE
.KE
The new directory "where.name" is created in the directory given by
"where.dir". The initial attributes of the new directory are given
by "attributes". A reply "status" of NFS_OK indicates that the new
directory was created, and reply "file" and reply "attributes" are
its file handle and attributes. Any other reply "status" means
that the operation failed and no directory was created.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Remove Directory
.IX "NFS server procedures" NFSPROC_RMDIR() "" \fINFSPROC_RMDIR()\fP
.DS
stat
NFSPROC_RMDIR(diropargs) = 15;
.DE
.KE
The existing empty directory "name" in the directory given by "dir"
is removed. If the reply is
.I NFS_OK ,
the directory was removed.
.LP
Note: possibly non-idempotent operation.
.KS
.NH 3
\&Read From Directory
.IX "NFS server procedures" NFSPROC_READDIR() "" \fINFSPROC_READDIR()\fP
.DS
struct readdirargs {
    fhandle dir;
    nfscookie cookie;
    unsigned count;
};

struct entry {
    unsigned fileid;
    filename name;
    nfscookie cookie;
    entry *nextentry;
};

union readdirres switch (stat status) {
    case NFS_OK:
        struct {
            entry *entries;
            bool eof;
        } readdirok;
    default:
        void;
};

readdirres
NFSPROC_READDIR (readdirargs) = 16;
.DE
.KE
Returns a variable number of directory entries, with a total size
of up to "count" bytes, from the directory given by "dir". If the
returned value of "status" is
.I NFS_OK ,
then it is followed by a
variable number of "entry"s.
Each "entry" contains a "fileid"
which consists of a unique number to identify the file within a
filesystem, the "name" of the file, and a "cookie" which is an
opaque pointer to the next entry in the directory. The cookie is
used in the next
.I READDIR
call to get more entries starting at a
given point in the directory. The special cookie zero (all bits
zero) can be used to get the entries starting at the beginning of
the directory. The "fileid" field should be the same number as the
"fileid" in the attributes of the file. (See the
.I "Basic Data Types"
section.)
The "eof" flag has a value of
.I TRUE
if there are no more entries in the directory.
.KS
.NH 3
\&Get Filesystem Attributes
.IX "NFS server procedures" NFSPROC_STATFS() "" \fINFSPROC_STATFS()\fP
.DS
union statfsres switch (stat status) {
    case NFS_OK:
        struct {
            unsigned tsize;
            unsigned bsize;
            unsigned blocks;
            unsigned bfree;
            unsigned bavail;
        } info;
    default:
        void;
};

statfsres
NFSPROC_STATFS(fhandle) = 17;
.DE
.KE
If the reply "status" is
.I NFS_OK ,
then the reply "info" gives the
attributes for the filesystem that contains the file referred to by the
input fhandle. The attribute fields contain the following values:
.IP tsize:
The optimum transfer size of the server in bytes. This is
the number of bytes the server would like to have in the
data part of READ and WRITE requests.
.IP bsize:
The block size in bytes of the filesystem.
.IP blocks:
The total number of "bsize" blocks on the filesystem.
.IP bfree:
The number of free "bsize" blocks on the filesystem.
.IP bavail:
The number of "bsize" blocks available to non-privileged users.
.LP
Note: This call does not work well if a filesystem has variable
size blocks.
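.LP
The cookie mechanism described for READDIR above can be sketched as a
simple client loop. The Python below is an illustration under stated
assumptions, not an implementation of the procedure: fake_readdir is a
hypothetical stand-in for the server, its cookie is just an index into
a list (a real cookie is opaque to the client), and it limits replies
by number of entries where the real "count" argument limits them by
bytes.

```python
def fake_readdir(names, cookie, max_entries):
    """Stand-in for NFSPROC_READDIR over an in-memory list of names.

    Returns (entries, eof); each entry is (name, cookie), where the
    cookie identifies the position just after that entry.
    """
    chunk = names[cookie:cookie + max_entries]
    entries = [(name, cookie + i + 1) for i, name in enumerate(chunk)]
    eof = cookie + len(chunk) >= len(names)
    return entries, eof


def list_directory(names, max_entries=2):
    """Collect every name by repeating READDIR until eof is TRUE."""
    cookie = 0  # the special cookie zero starts at the beginning
    result = []
    while True:
        entries, eof = fake_readdir(names, cookie, max_entries)
        for name, next_cookie in entries:
            result.append(name)
            cookie = next_cookie  # resume after the last entry seen
        if eof or not entries:
            return result


listing = list_directory(["a", "b", "c", "d", "e"])
```

Each call resumes from the cookie of the last entry returned by the
previous call, so no per-client position has to be kept on the server.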
.NH 1
\&NFS Implementation Issues
.IX NFS implementation
.LP
The NFS protocol is designed to be operating system independent, but
since this version was designed in a UNIX environment, many
operations have semantics similar to the operations of the UNIX file
system. This section discusses some of the implementation-specific
semantic issues.
.NH 2
\&Server/Client Relationship
.IX NFS "server/client relationship"
.LP
The NFS protocol is designed to allow servers to be as simple and
general as possible. Sometimes the simplicity of the server can be a
problem, if the client wants to implement complicated filesystem
semantics.
.LP
For example, some operating systems allow removal of open files. A
process can open a file and, while it is open, remove it from the
directory. The file can be read and written as long as the process
keeps it open, even though the file has no name in the filesystem.
It is impossible for a stateless server to implement these semantics.
The client can do some tricks such as renaming the file on remove,
and only removing it on close. We believe that the server provides
enough functionality to implement most file system semantics on the
client.
.LP
Every NFS client can also potentially be a server, and remote and
local mounted filesystems can be freely intermixed. This leads to
some interesting problems when a client travels down the directory
tree of a remote filesystem and reaches the mount point on the server
for another remote filesystem. Allowing the server to follow the
second remote mount would require loop detection, server lookup, and
user revalidation. Instead, we decided not to let clients cross a
server's mount point. When a client does a LOOKUP on a directory on
which the server has mounted a filesystem, the client sees the
underlying directory instead of the mounted directory.
A client can
do remote mounts that match the server's mount points to maintain the
server's view.
.LP
.NH 2
\&Pathname Interpretation
.IX NFS "pathname interpretation"
.LP
There are a few complications to the rule that pathnames are always
parsed on the client. For example, symbolic links could have
different interpretations on different clients. Another common
problem for non-UNIX implementations is the special interpretation of
the pathname ".." to mean the parent of a given directory. The next
revision of the protocol uses an explicit flag to indicate the parent
instead.
.NH 2
\&Permission Issues
.IX NFS "permission issues"
.LP
The NFS protocol, strictly speaking, does not define the permission
checking used by servers. However, it is expected that a server
will do normal operating system permission checking using
.I AUTH_UNIX
style authentication as the basis of its protection mechanism. The
server gets the client's effective "uid", effective "gid", and groups
on each call and uses them to check permission. There are various
problems with this method that can be resolved in interesting ways.
.LP
Using "uid" and "gid" implies that the client and server share the
same "uid" list. Every server and client pair must have the same
mapping from user to "uid" and from group to "gid". Since every
client can also be a server, this tends to imply that the whole
network shares the same "uid/gid" space.
.I AUTH_DES
(and the next
revision of the NFS protocol) uses string names instead of numbers,
but there are still complex problems to be solved.
.LP
Another problem arises due to the usually stateful open operation.
Most operating systems check permission at open time, and then check
that the file is open on each read and write request.
With stateless
servers, the server has no idea that the file is open and must do
permission checking on each read and write call. On a local
filesystem, a user can open a file and then change the permissions so
that no one is allowed to touch it, but will still be able to write
to the file because it is open. On a remote filesystem, by contrast,
the write would fail. To get around this problem, the server's
permission checking algorithm should allow the owner of a file to
access it regardless of the permission setting.
.LP
A similar problem has to do with paging in from a file over the
network. The operating system usually checks for execute permission
before opening a file for demand paging, and then reads blocks from
the open file. The file may not have read permission, but after it
is opened it doesn't matter. An NFS server cannot tell the
difference between a normal file read and a demand page-in read. To
make this work, the server allows reading of files if the "uid" given
in the call has execute or read permission on the file.
.LP
In most operating systems, a particular user (on UNIX, the user ID zero)
has access to all files no matter what permission and ownership they
have. This "super-user" permission may not be allowed on the server,
since anyone who can become super-user on their workstation could
gain access to all remote files. The UNIX server by default maps
user id 0 to -2 before doing its access checking. This works except
for NFS root filesystems, where super-user access cannot be avoided.
.NH 2
\&Setting RPC Parameters
.IX NFS "setting RPC parameters"
.LP
Various file system parameters and options should be set at mount
time. The mount protocol is described in the appendix below. For
example, both "Soft" and "Hard" mounts are usually
provided.
Soft mounted file systems return errors when RPC
operations fail (after a given number of optional retransmissions),
while hard mounted file systems continue to retransmit forever.
Clients and servers may need to keep caches of recent operations to
help avoid problems with non-idempotent operations.
.NH 1
\&Mount Protocol Definition
.IX "mount protocol" "" "" "" PAGE MAJOR
.sp 1
.NH 2
\&Introduction
.IX "mount protocol" introduction
.LP
The mount protocol is separate from, but related to, the NFS
protocol. It provides operating system specific services to get the
NFS off the ground -- looking up server path names, validating user
identity, and checking access permissions. Clients use the mount
protocol to get the first file handle, which allows them entry into a
remote filesystem.
.LP
The mount protocol is kept separate from the NFS protocol to make it
easy to plug in new access checking and validation methods without
changing the NFS server protocol.
.LP
Notice that the protocol definition implies stateful servers because
the server maintains a list of clients' mount requests. The mount
list information is not critical for the correct functioning of
either the client or the server. It is intended for advisory use
only, for example, to warn possible clients when a server is going
down.
.LP
Version one of the mount protocol is used with version two of the NFS
protocol. The only connecting point is the
.I fhandle
structure, which is the same for both protocols.
.NH 2
\&RPC Information
.IX "mount protocol" "RPC information"
.IP \fIAuthentication\fP
The mount service uses
.I AUTH_UNIX
and
.I AUTH_DES
style authentication only.
.IP "\fITransport Protocols\fP"
The mount service is currently supported on UDP/IP only.
.IP "\fIPort Number\fP"
Consult the server's portmapper, described in the chapter
.I "Remote Procedure Calls: Protocol Specification" ,
to find the port number on which the mount service is registered.
.NH 2
\&Sizes of XDR Structures
.IX "mount protocol" "XDR structure sizes"
.LP
These are the sizes, given in decimal bytes, of various XDR
structures used in the protocol:
.DS
/* \fIThe maximum number of bytes in a pathname argument\fP */
const MNTPATHLEN = 1024;

/* \fIThe maximum number of bytes in a name argument\fP */
const MNTNAMLEN = 255;

/* \fIThe size in bytes of the opaque file handle\fP */
const FHSIZE = 32;
.DE
.NH 2
\&Basic Data Types
.IX "mount protocol" "basic data types"
.IX "mount data types"
.LP
This section presents the data types used by the mount protocol.
In many cases they are similar to the types used in NFS.
.KS
.NH 3
\&fhandle
.IX "mount data types" fhandle "" \fIfhandle\fP
.DS
typedef opaque fhandle[FHSIZE];
.DE
.KE
The type
.I fhandle
is the file handle that the server passes to the
client. All file operations are done using file handles to refer
to a file or directory. The file handle can contain whatever
information the server needs to distinguish an individual file.
.LP
This is the same as the "fhandle" XDR definition in version 2 of
the NFS protocol; see
.I "Basic Data Types"
in the definition of the NFS protocol, above.
.KS
.NH 3
\&fhstatus
.IX "mount data types" fhstatus "" \fIfhstatus\fP
.DS
union fhstatus switch (unsigned status) {
    case 0:
        fhandle directory;
    default:
        void;
};
.DE
.KE
The type
.I fhstatus
is a union. If a "status" of zero is returned,
the call completed successfully, and a file handle for the
"directory" follows.
A non-zero status indicates some sort of
error. In this case the status is a UNIX error number.
.KS
.NH 3
\&dirpath
.IX "mount data types" dirpath "" \fIdirpath\fP
.DS
typedef string dirpath<MNTPATHLEN>;
.DE
.KE
The type
.I dirpath
is a server pathname of a directory.
.KS
.NH 3
\&name
.IX "mount data types" name "" \fIname\fP
.DS
typedef string name<MNTNAMLEN>;
.DE
.KE
The type
.I name
is an arbitrary string used for various names.
.NH 2
\&Server Procedures
.IX "mount server procedures"
.LP
The following sections define the RPC procedures supplied by a
mount server.
.ie t .DS
.el .DS L
.ft I
/*
* Protocol description for the mount program
*/
.ft CW

program MOUNTPROG {
.ft I
/*
* Version 1 of the mount protocol used with
* version 2 of the NFS protocol.
*/
.ft CW
    version MOUNTVERS {
        void        MOUNTPROC_NULL(void)    = 0;
        fhstatus    MOUNTPROC_MNT(dirpath)  = 1;
        mountlist   MOUNTPROC_DUMP(void)    = 2;
        void        MOUNTPROC_UMNT(dirpath) = 3;
        void        MOUNTPROC_UMNTALL(void) = 4;
        exportlist  MOUNTPROC_EXPORT(void)  = 5;
    } = 1;
} = 100005;
.DE
.KS
.NH 3
\&Do Nothing
.IX "mount server procedures" MNTPROC_NULL() "" \fIMNTPROC_NULL()\fP
.DS
void
MNTPROC_NULL(void) = 0;
.DE
.KE
This procedure does no work. It is made available in all RPC
services to allow server response testing and timing.
.KS
.NH 3
\&Add Mount Entry
.IX "mount server procedures" MNTPROC_MNT() "" \fIMNTPROC_MNT()\fP
.DS
fhstatus
MNTPROC_MNT(dirpath) = 1;
.DE
.KE
If the reply "status" is 0, then the reply "directory" contains the
file handle for the directory "dirname". This file handle may be
used in the NFS protocol. This procedure also adds a new entry to
the mount list for this client mounting "dirname".
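.LP
The "fhstatus" reply can be decoded with very little code. The Python
sketch below is illustrative only and the function name is
hypothetical. It assumes the standard XDR encoding of a discriminated
union: a 4-byte big-endian discriminant, followed by the 32-byte
(FHSIZE) file handle only when the status is zero.

```python
import struct

FHSIZE = 32  # size in bytes of the opaque file handle


def unpack_fhstatus(reply):
    """Decode an fhstatus reply into (status, fhandle).

    fhandle is None when status is non-zero (a UNIX error number).
    """
    (status,) = struct.unpack_from(">I", reply, 0)
    if status != 0:
        return status, None
    fhandle = reply[4:4 + FHSIZE]
    if len(fhandle) != FHSIZE:
        raise ValueError("reply too short to hold a file handle")
    return 0, fhandle
```

A client would pass the returned handle, unchanged and uninterpreted,
as the "fhandle" argument of subsequent NFS procedures.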
.KS
.NH 3
\&Return Mount Entries
.IX "mount server procedures" MNTPROC_DUMP() "" \fIMNTPROC_DUMP()\fP
.DS
struct *mountlist {
    name hostname;
    dirpath directory;
    mountlist nextentry;
};

mountlist
MNTPROC_DUMP(void) = 2;
.DE
.KE
Returns the list of remote mounted filesystems. The "mountlist"
contains one entry for each "hostname" and "directory" pair.
.KS
.NH 3
\&Remove Mount Entry
.IX "mount server procedures" MNTPROC_UMNT() "" \fIMNTPROC_UMNT()\fP
.DS
void
MNTPROC_UMNT(dirpath) = 3;
.DE
.KE
Removes the mount list entry for the input "dirpath".
.KS
.NH 3
\&Remove All Mount Entries
.IX "mount server procedures" MNTPROC_UMNTALL() "" \fIMNTPROC_UMNTALL()\fP
.DS
void
MNTPROC_UMNTALL(void) = 4;
.DE
.KE
Removes all of the mount list entries for this client.
.KS
.NH 3
\&Return Export List
.IX "mount server procedures" MNTPROC_EXPORT() "" \fIMNTPROC_EXPORT()\fP
.DS
struct *groups {
    name grname;
    groups grnext;
};

struct *exportlist {
    dirpath filesys;
    groups groups;
    exportlist next;
};

exportlist
MNTPROC_EXPORT(void) = 5;
.DE
.KE
Returns a variable number of export list entries. Each entry
contains a filesystem name and a list of groups that are allowed to
import it. The filesystem name is in "filesys", and the group name
is in the list "groups".
.LP
Note: The exportlist should contain
more information about the status of the filesystem, such as a
read-only flag.
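.LP
The list-valued results above ("mountlist" and "exportlist") are linked
lists on the wire. As an illustration, the Python sketch below decodes
a "mountlist" reply. It assumes the usual XDR encoding of optional
data, in which each entry is preceded by a 4-byte boolean that is
non-zero when an entry follows and zero at the end of the list, and of
strings, which carry a 4-byte length and are padded to a multiple of
four bytes. The helper names are hypothetical.

```python
import struct


def pack_string(text):
    """XDR string: 4-byte length, bytes, zero padding to 4 bytes."""
    data = text.encode("ascii")
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def _read_u32(buf, pos):
    return struct.unpack_from(">I", buf, pos)[0], pos + 4


def _read_string(buf, pos):
    length, pos = _read_u32(buf, pos)
    text = buf[pos:pos + length].decode("ascii")
    return text, pos + length + (-length % 4)  # skip the padding


def unpack_mountlist(buf):
    """Decode a mountlist into a list of (hostname, directory) pairs."""
    pos = 0
    mounts = []
    while True:
        present, pos = _read_u32(buf, pos)
        if not present:  # zero marks the end of the list
            return mounts
        hostname, pos = _read_string(buf, pos)
        directory, pos = _read_string(buf, pos)
        mounts.append((hostname, directory))
```

An "exportlist" could be decoded the same way, with a nested loop of
the same shape for each entry's "groups" list.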