History log of /linux-master/fs/nfsd/nfsproc.c
Revision Date Author Comments
# 24d92de9 15-Feb-2024 Trond Myklebust <trond.myklebust@hammerspace.com>

nfsd: Fix NFSv3 atomicity bugs in nfsd_setattr()

The main point of the guarded SETATTR is to prevent races with other
WRITE and SETATTR calls. That requires that the check of the guard time
against the inode ctime be done after taking the inode lock.

Furthermore, we need to take into account the 32-bit nature of
timestamps in NFSv3, and the possibility that files may change at a
faster rate than once a second.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 507df40e 18-May-2023 Chuck Lever <chuck.lever@oracle.com>

NFSD: Hoist rq_vec preparation into nfsd_read()

Accrue the following benefits:

a) Deduplicate this common bit of code.

b) Don't prepare rq_vec for NFSv2 and NFSv3 spliced reads, which
don't use rq_vec. This is already the case for
nfsd4_encode_read().

c) Eventually, converting NFSD's read path to use a bvec iterator
will be simpler.

In the next patch, nfsd_iter_read() will replace nfsd_readv() for
all NFS versions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 65ba3d24 10-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Use per-CPU counters to tally server RPC counts

- Improves counting accuracy
- Reduces cross-CPU memory traffic

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# c1632a0f 12-Jan-2023 Christian Brauner <brauner@kernel.org>

fs: port ->setattr() to pass mnt_idmap

Convert to struct mnt_idmap.

Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.

Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.

Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.

Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>


# 5304930d 07-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

NFSD: Use set_bit(RQ_DROPME)

The premise that "Once an svc thread is scheduled and executing an
RPC, no other processes will touch svc_rqst::rq_flags" is false.
svc_xprt_enqueue() examines the RQ_BUSY flag in scheduled nfsd
threads when determining which thread to wake up next.

Fixes: 9315564747cb ("NFSD: Use only RQ_DROPME to signal the need to drop a reply")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 93155647 26-Nov-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Use only RQ_DROPME to signal the need to drop a reply

Clean up: NFSv2 has the only two usages of rpc_drop_reply in the
NFSD code base. Since NFSv2 is going away at some point, replace
these in order to simplify the "drop this reply?" check in
nfsd_dispatch().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>


# cb12fae1 18-Oct-2022 Jeff Layton <jlayton@kernel.org>

nfsd: move nfserrno() to vfs.c

nfserrno() is common to all nfs versions, but nfsproc.c is specifically
for NFSv2. Move it to vfs.c, and the prototype to vfs.h.

While we're in here, remove the #ifdef EDQUOT check in this function.
It's apparently a holdover from the initial merge of the nfsd code in
1997. No other place in the kernel checks that that symbol is defined
before using it, so I think we can dispense with it here.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 98124f5b 12-Sep-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Refactor common code out of dirlist helpers

The dust has settled a bit and it's become obvious what code is
totally common between nfsd_init_dirlist_pages() and
nfsd3_init_dirlist_pages(). Move that common code to SUNRPC.

The new helper brackets the existing xdr_init_decode_pages() API.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 103cc1fa 12-Sep-2022 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Parametrize how much of argsize should be zeroed

Currently, SUNRPC clears the whole of .pc_argsize before processing
each incoming RPC transaction. Add an extra parameter to struct
svc_procedure to enable upper layers to reduce the amount of each
operation's argument structure that is zeroed by SUNRPC.

The size of struct nfsd4_compoundargs, in particular, is a lot to
clear on each incoming RPC Call. A subsequent patch will cut this
down to something closer to what NFSv2 and NFSv3 uses.

This patch should cause no behavior changes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 9558f930 05-Sep-2022 NeilBrown <neilb@suse.de>

NFSD: drop fname and flen args from nfsd_create_locked()

nfsd_create_locked() does not use the "fname" and "flen" arguments, so
drop them from declaration and all callers.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 401bc1f9 01-Sep-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Protect against send buffer overflow in NFSv2 READ

Since before the git era, NFSD has conserved the number of pages
held by each nfsd thread by combining the RPC receive and send
buffers into a single array of pages. This works because there are
no cases where an operation needs a large RPC Call message and a
large RPC Reply at the same time.

Once an RPC Call has been received, svc_process() updates
svc_rqst::rq_res to describe the part of rq_pages that can be
used for constructing the Reply. This means that the send buffer
(rq_res) shrinks when the received RPC record containing the RPC
Call is large.

A client can force this shrinkage on TCP by sending a correctly-
formed RPC Call header contained in an RPC record that is
excessively large. The full maximum payload size cannot be
constructed in that case.

Cc: <stable@vger.kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 00b44926 01-Sep-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Protect against send buffer overflow in NFSv2 READDIR

Restore the previous limit on the @count argument to prevent a
buffer overflow attack.

Fixes: 53b1119a6e50 ("NFSD: Fix READDIR buffer overflow")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# debf16f0 26-Jul-2022 NeilBrown <neilb@suse.de>

NFSD: use explicit lock/unlock for directory ops

When creating or unlinking a name in a directory use explicit
inode_lock_nested() instead of fh_lock(), and explicit calls to
fh_fill_pre_attrs() and fh_fill_post_attrs(). This is already done
for renames, with lock_rename() as the explicit locking.

Also move the 'fill' calls closer to the operation that might change the
attributes. This way they are avoided on some error paths.

For the v2-only code in nfsproc.c, the fill calls are not replaced as
they aren't needed.

Making the locking explicit will simplify proposed future changes to
locking for directories. It also makes it easily visible exactly where
pre/post attributes are used - not all callers of fh_lock() actually
need the pre/post attributes.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 93adc1e3 26-Jul-2022 NeilBrown <neilb@suse.de>

NFSD: set attributes when creating symlinks

The NFS protocol includes attributes when creating symlinks.
Linux does store attributes for symlinks and allows them to be set,
though they are not used for permission checking.

NFSD currently doesn't set standard (struct iattr) attributes when
creating symlinks, but for NFSv4 it does set ACLs and security labels.
This is inconsistent.

To improve consistency, pass the provided attributes into nfsd_symlink()
and call nfsd_create_setattr() to set them.

NOTE: this results in a behaviour change for all NFS versions when the
client sends non-default attributes with a SYMLINK request. With the
Linux client, the only attributes are:
attr.ia_mode = S_IFLNK | S_IRWXUGO;
attr.ia_valid = ATTR_MODE;
so the final outcome will be unchanged. Other clients might sent
different attributes, and if they did they probably expect them to be
honoured.

We ignore any error from nfsd_create_setattr(). It isn't really clear
what should be done if a file is successfully created, but the
attributes cannot be set. NFS doesn't allow partial success to be
reported. Reporting failure is probably more misleading than reporting
success, so the status is ignored.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 7fe2a71d 26-Jul-2022 NeilBrown <neilb@suse.de>

NFSD: introduce struct nfsd_attrs

The attributes that nfsd might want to set on a file include 'struct
iattr' as well as an ACL and security label.
The latter two are passed around quite separately from the first, in
part because they are only needed for NFSv4. This leads to some
clumsiness in the code, such as the attributes NOT being set in
nfsd_create_setattr().

We need to keep the directory locked until all attributes are set to
ensure the file is never visibile without all its attributes. This need
combined with the inconsistent handling of attributes leads to more
clumsiness.

As a first step towards tidying this up, introduce 'struct nfsd_attrs'.
This is passed (by reference) to vfs.c functions that work with
attributes, and is assembled by the various nfs*proc functions which
call them. As yet only iattr is included, but future patches will
expand this.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 184416d4 15-Mar-2022 Dan Carpenter <dan.carpenter@oracle.com>

NFSD: prevent underflow in nfssvc_decode_writeargs()

Smatch complains:

fs/nfsd/nfsxdr.c:341 nfssvc_decode_writeargs()
warn: no lower bound on 'args->len'

Change the type to unsigned to prevent this issue.

Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 12bcbd40 18-Dec-2021 Jeff Layton <jeff.layton@primarydata.com>

nfsd: Retry once in nfsd_open on an -EOPENSTALE return

If we get back -EOPENSTALE from an NFSv4 open, then we either got some
unhandled error or the inode we got back was not the same as the one
associated with the dentry.

We really have no recourse in that situation other than to retry the
open, and if it fails to just return nfserr_stale back to the client.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# a2694e51 18-Dec-2021 Jeff Layton <jeff.layton@primarydata.com>

nfsd: Add errno mapping for EREMOTEIO

The NFS client can occasionally return EREMOTEIO when signalling issues
with the server. ...map to NFSERR_IO.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# b3d0db70 18-Dec-2021 Peng Tao <tao.peng@primarydata.com>

nfsd: map EBADF

Now that we have open file cache, it is possible that another client
deletes the file and DP will not know about it. Then IO to MDS would
fail with BADSTATEID and knfsd would start state recovery, which
should fail as well and then nfs read/write will fail with EBADF.
And it triggers a WARN() in nfserrno().

-----------[ cut here ]------------
WARNING: CPU: 0 PID: 13529 at fs/nfsd/nfsproc.c:758 nfserrno+0x58/0x70 [nfsd]()
nfsd: non-standard errno: -9
modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_connt
pata_acpi floppy
CPU: 0 PID: 13529 Comm: nfsd Tainted: G W 4.1.5-00307-g6e6579b #7
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
0000000000000000 00000000464e6c9c ffff88079085fba8 ffffffff81789936
0000000000000000 ffff88079085fc00 ffff88079085fbe8 ffffffff810a08ea
ffff88079085fbe8 ffff88080f45c900 ffff88080f627d50 ffff880790c46a48
all Trace:
[<ffffffff81789936>] dump_stack+0x45/0x57
[<ffffffff810a08ea>] warn_slowpath_common+0x8a/0xc0
[<ffffffff810a0975>] warn_slowpath_fmt+0x55/0x70
[<ffffffff81252908>] ? splice_direct_to_actor+0x148/0x230
[<ffffffffa02fb8c0>] ? fsid_source+0x60/0x60 [nfsd]
[<ffffffffa02f9918>] nfserrno+0x58/0x70 [nfsd]
[<ffffffffa02fba57>] nfsd_finish_read+0x97/0xb0 [nfsd]
[<ffffffffa02fc7a6>] nfsd_splice_read+0x76/0xa0 [nfsd]
[<ffffffffa02fcca1>] nfsd_read+0xc1/0xd0 [nfsd]
[<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
[<ffffffffa03073da>] nfsd3_proc_read+0xba/0x150 [nfsd]
[<ffffffffa02f7a03>] nfsd_dispatch+0xc3/0x210 [nfsd]
[<ffffffffa0233af2>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
[<ffffffffa0232913>] svc_process_common+0x453/0x6f0 [sunrpc]
[<ffffffffa0232cc3>] svc_process+0x113/0x1b0 [sunrpc]
[<ffffffffa02f740f>] nfsd+0xff/0x170 [nfsd]
[<ffffffffa02f7310>] ? nfsd_destroy+0x80/0x80 [nfsd]
[<ffffffff810bf3a8>] kthread+0xd8/0xf0
[<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0
[<ffffffff817912a2>] ret_from_fork+0x42/0x70
[<ffffffff810bf2d0>] ? kthread_create_on_node+0x1b0/0x1b0

Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Lance Shelton <lance.shelton@hammerspace.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 6a2f7744 21-Dec-2021 Chuck Lever <chuck.lever@oracle.com>

NFSD: Fix zero-length NFSv3 WRITEs

The Linux NFS server currently responds to a zero-length NFSv3 WRITE
request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE
with NFS4_OK and count of zero.

RFC 1813 says of the WRITE procedure's @count argument:

count
The number of bytes of data to be written. If count is
0, the WRITE will succeed and return a count of 0,
barring errors due to permissions checking.

RFC 8881 has similar language for NFSv4, though NFSv4 removed the
explicit @count argument because that value is already contained in
the opaque payload array.

The synthetic client pynfs's WRT4 and WRT15 tests do emit zero-
length WRITEs to exercise this spec requirement. Commit fdec6114ee1f
("nfsd4: zero-length WRITE should succeed") addressed the same
problem there with the same fix.

But interestingly the Linux NFS client does not appear to emit zero-
length WRITEs, instead squelching them. I'm not aware of a test that
can generate such WRITEs for NFSv3, so I wrote a naive C program to
generate a zero-length WRITE and test this fix.

Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders")
Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 53b1119a 16-Dec-2021 Chuck Lever <chuck.lever@oracle.com>

NFSD: Fix READDIR buffer overflow

If a client sends a READDIR count argument that is too small (say,
zero), then the buffer size calculation in the new init_dirlist
helper functions results in an underflow, allowing the XDR stream
functions to write beyond the actual buffer.

This calculation has always been suspect. NFSD has never sanity-
checked the READDIR count argument, but the old entry encoders
managed the problem correctly.

With the commits below, entry encoding changed, exposing the
underflow to the pointer arithmetic in xdr_reserve_space().

Modern NFS clients attempt to retrieve as much data as possible
for each READDIR request. Also, we have no unit tests that
exercise the behavior of READDIR at the lower bound of @count
values. Thus this case was missed during testing.

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream")
Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# dae9a6ca 30-Sep-2021 Chuck Lever <chuck.lever@oracle.com>

NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment()

Refactor.

Now that the NFSv2 and NFSv3 XDR decoders have been converted to
use xdr_streams, the WRITE decoder functions can use
xdr_stream_subsegment() to extract the WRITE payload into its own
xdr_buf, just as the NFSv4 WRITE XDR decoder currently does.

That makes it possible to pass the first kvec, pages array + length,
page_base, and total payload length via a single function parameter.

The payload's page_base is not yet assigned or used, but will be in
subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# bb0a55bb 20-Aug-2021 J. Bruce Fields <bfields@redhat.com>

nfs: don't allow reexport reclaims

In the reexport case, nfsd is currently passing along locks with the
reclaim bit set. The client sends a new lock request, which is granted
if there's currently no conflict--even if it's possible a conflicting
lock could have been briefly held in the interim.

We don't currently have any way to safely grant reclaim, so for now
let's just deny them all.

I'm doing this by passing the reclaim bit to nfs and letting it fail the
call, with the idea that eventually the client might be able to do
something more forgiving here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 6e3e2c43 01-Mar-2021 Al Viro <viro@zeniv.linux.org.uk>

new helper: inode_wrong_type()

inode_wrong_type(inode, mode) returns true if setting inode->i_mode
to given value would've changed the inode type. We have enough of
those checks open-coded to make a helper worthwhile.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 8a2cf9f5 15-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Remove unused NFSv2 directory entry encoders

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# f5dcccd6 14-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 8141d6a2 13-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder

Clean up: Counting the bytes used by each returned directory entry
seems less brittle to me than trying to measure consumed pages after
the fact.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# d5253200 13-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Add a helper that encodes NFSv3 directory offset cookies

Refactor: Add helper function similar to nfs3svc_encode_cookie3().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# a6f8d9dc 23-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# d9014b0f 23-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 92b54a4f 23-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# a887eaed 23-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 stat encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 2f221d6f 21-Jan-2021 Christian Brauner <christian.brauner@ubuntu.com>

attr: handle idmapped mounts

When file attributes are changed most filesystems rely on the
setattr_prepare(), setattr_copy(), and notify_change() helpers for
initialization and permission checking. Let them handle idmapped mounts.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped mounts. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.

Helpers that perform checks on the ia_uid and ia_gid fields in struct
iattr assume that ia_uid and ia_gid are intended values and have already
been mapped correctly at the userspace-kernelspace boundary as we
already do today. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>


# 788cd46e 13-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Add helper to set up the pages where the dirlist is encoded

Add a helper similar to nfsd3_init_dirlist_pages().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 1fcbd1c9 20-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 READLINK argument decoder to use struct xdr_stream

If the code that sets up the sink buffer for nfsd_readlink() is
moved adjacent to the nfsd_readlink() call site that uses it, then
the only argument is a file handle, and the fhandle decoder can be
used instead.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 8c293ef9 20-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 READ argument decoder to use struct xdr_stream

The code that sets up rq_vec is refactored so that it is now
adjacent to the nfsd_read() call site where it is used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# ebcd8e8b 20-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Update the NFSv2 GETATTR argument decoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 2289e87b 17-Sep-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Make trace_svc_process() display the RPC procedure symbolically

The next few patches will employ these strings to help make server-
side trace logs more human-readable. A similar technique is already
in use in kernel RPC client code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 788f7183 05-Nov-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Add common helpers to decode void args and encode void results

Start off the conversion to xdr_stream by de-duplicating the functions
that decode void arguments and encode void results.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# cc028a10 02-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Hoist status code encoding into XDR encoder functions

The original intent was presumably to reduce code duplication. The
trade-off was:

- No support for an NFSD proc function returning a non-success
RPC accept_stat value.
- No support for void NFS replies to non-NULL procedures.
- Everyone pays for the deduplication with a few extra conditional
branches in a hot path.

In addition, nfsd_dispatch() leaves *statp uninitialized in the
success path, unlike svc_generic_dispatch().

Address all of these problems by moving the logic for encoding
the NFS status code into the NFS XDR encoders themselves. Then
update the NFS .pc_func methods to return an RPC accept_stat
value.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# f0af2210 01-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Call NFSv2 encoders on error returns

Remove special dispatcher logic for NFSv2 error responses. These are
rare to the point of becoming extinct, but all NFS responses have to
pay the cost of the extra conditional branches.

With this change, the NFSv2 error cases now get proper
xdr_ressize_check() calls.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 1841b9b6 01-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Fix .pc_release method for NFSv2

nfsd_release_fhandle() assumes that rqstp->rq_resp always points to
an nfsd_fhandle struct. In fact, no NFSv2 procedure uses struct
nfsd_fhandle as its response structure.

So far that has been "safe" to do because the res structs put the
resp->fh field at that same offset as struct nfsd_fhandle. I don't
think that's a guarantee, though, and there is certainly nothing
preventing a developer from altering the fields in those structures.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 7cf83570 01-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Remove vestigial typedefs

Clean up: These are not used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 6b3dccd4 01-Oct-2020 Chuck Lever <chuck.lever@oracle.com>

NFSD: Add missing NFSv2 .pc_func methods

There's no protection in nfsd_dispatch() against a NULL .pc_func
helpers. A malicious NFS client can trigger a crash by invoking the
unused/unsupported NFSv2 ROOT or WRITECACHE procedures.

The current NFSD dispatcher does not support returning a void reply
to a non-NULL procedure, so the reply to both of these is wrong, for
the moment.

Cc: <stable@vger.kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# df561f66 23-Aug-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>


# 19e0663f 06-Jan-2020 Trond Myklebust <trondmy@gmail.com>

nfsd: Ensure sampling of the write verifier is atomic with the write

When doing an unstable write, we need to ensure that we sample the
write verifier before releasing the lock, and allowing a commit to
the same file to proceed.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# b6356d42 03-Nov-2019 Arnd Bergmann <arnd@arndb.de>

nfsd: use time64_t in nfsd_proc_setattr() check

Change to time64_t and ktime_get_real_seconds() to make the
logic work correctly on 32-bit architectures beyond 2038.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 2a1aa489 03-Nov-2019 Arnd Bergmann <arnd@arndb.de>

nfsd: pass a 64-bit guardtime to nfsd_setattr()

Guardtime handling in nfs3 differs between 32-bit and 64-bit
architectures, and uses the deprecated time_t type.

Change it to using time64_t, which behaves the same way on
64-bit and 32-bit architectures, treating the number as an
unsigned 32-bit entity with a range of year 1970 to 2106
consistently, and avoiding the y2038 overflow.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 83a63072 26-Aug-2019 Trond Myklebust <trondmy@gmail.com>

nfsd: fix nfs read eof detection

Currently, the knfsd server assumes that a short read indicates an
end of file. That assumption is incorrect. The short read means that
either we've hit the end of file, or we've hit a read error.

In the case of a read error, the client may want to retry (as per the
implementation recommendations in RFC1813 and RFC7530), but currently it
is being told that it hit an eof.

Move the code to detect eof from version specific code into the generic
nfsd read.

Report eof only in the two following cases:
1) read() returns a zero length short read with no error.
2) the offset+length of the read is >= the file size.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 11b4d66e 27-Jul-2018 Chuck Lever <chuck.lever@oracle.com>

NFSD: Handle full-length symlinks

I've given up on the idea of zero-copy handling of SYMLINK on the
server side. This is because the Linux VFS symlink API requires the
symlink pathname to be in a NUL-terminated kmalloc'd buffer. The
NUL-termination is going to be problematic (watching out for
landing on a page boundary and dealing with a 4096-byte pathname).

I don't believe that SYMLINK creation is on a performance path or is
requested frequently enough that it will cause noticeable CPU cache
pollution due to data copies.

There will be two places where a transport callout will be necessary
to fill in the rqstp: one will be in the svc_fill_symlink_pathname()
helper that is used by NFSv2 and NFSv3, and the other will be in
nfsd4_decode_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3fd9557a 27-Jul-2018 Chuck Lever <chuck.lever@oracle.com>

NFSD: Refactor the generic write vector fill helper

fill_in_write_vector() is nearly the same logic as
svc_fill_write_vector(), but there are a few differences so that
the former can handle multiple WRITE payloads in a single COMPOUND.

svc_fill_write_vector() can be adjusted so that it can be used in
the NFSv4 WRITE code path too. Instead of assuming the pages are
coming from rq_args.pages, have the caller pass in the page list.

The immediate benefit is a reduction of code duplication. It also
prevents the NFSv4 WRITE decoder from passing an empty vector
element when the transport has provided the payload in the xdr_buf's
page array.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 38a70315 27-Mar-2018 Chuck Lever <chuck.lever@oracle.com>

NFSD: Clean up legacy NFS SYMLINK argument XDR decoders

Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

- one fewer data copies on transports that support DDP
- consistent error checking across all versions
- reduction of code duplication
- support for both legal forms of SYMLINK requests on RDMA
transports for all versions of NFS (in particular, NFSv2, for
completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 8154ef27 27-Mar-2018 Chuck Lever <chuck.lever@oracle.com>

NFSD: Clean up legacy NFS WRITE argument XDR decoders

Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# b2441318 01-Nov-2017 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that became
the concluded license(s).

- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# aa8217d5 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_version instances as const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# b9c744c1 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_procinfo instances as const

struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 0becc118 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: move pc_count out of struct svc_procinfo

pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# d16d1867 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_encode callbacks

Drop the resp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# cc6acc20 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_decode callbacks

Drop the argp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 1150ded8 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_release callbacks

Drop the p and resp arguments as they are always NULL or can trivially
be derived from the rqstp argument. With that all functions now have the
same prototype, and we can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 1c8a5409 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_func callbacks

Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument. With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# e9679189 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_version instances as const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 860bda29 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_procinfo instances as const

struct svc_procinfo contains function pointers, and marking it as
constant avoids it being able to be used as an attach vector for
code injections.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 7fd38af9 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: move pc_count out of struct svc_procinfo

pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 63f8de37 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_encode callbacks

Drop the resp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 026fec7e 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_decode callbacks

Drop the argp argument as it can trivially be derived from the rqstp
argument. With that all functions now have the same prototype, and we
can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# 8537488b 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_release callbacks

Drop the p and resp arguments as they are always NULL or can trivially
be derived from the rqstp argument. With that all functions now have the
same prototype, and we can remove the unsafe casting to kxdrproc_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# a6beb732 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: properly type pc_func callbacks

Drop the argp and resp arguments as they can trivially be derived from
the rqstp argument. With that all functions now have the same prototype,
and we can remove the unsafe casting to svc_procfunc as well as the
svc_procfunc typedef itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# c952cd4e 09-Mar-2017 Kinglong Mee <kinglongmee@gmail.com>

nfsd: map the ENOKEY to nfserr_perm for avoiding warning

Now that Ext4 and f2fs filesystems support encrypted directories and
files, attempts to access those files may return ENOKEY, resulting in
the following WARNING.

Map ENOKEY to nfserr_perm instead of nfserr_io.

[ 1295.411759] ------------[ cut here ]------------
[ 1295.411787] WARNING: CPU: 0 PID: 12786 at fs/nfsd/nfsproc.c:796 nfserrno+0x74/0x80 [nfsd]
[ 1295.411806] nfsd: non-standard errno: -126
[ 1295.411816] Modules linked in: nfsd nfs_acl auth_rpcgss nfsv4 nfs lockd fscache tun bridge stp llc fuse ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event coretemp crct10dif_pclmul crc32_generic crc32_pclmul snd_ens1371 gameport ghash_clmulni_intel snd_ac97_codec f2fs intel_rapl_perf ac97_bus snd_seq ppdev snd_pcm snd_rawmidi snd_timer vmw_balloon snd_seq_device snd joydev soundcore parport_pc parport nfit acpi_cpufreq tpm_tis vmw_vmci tpm_tis_core tpm shpchp i2c_piix4 grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel e1000 mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi fjes [last unloaded: nfs_acl]
[ 1295.412522] CPU: 0 PID: 12786 Comm: nfsd Tainted: G W 4.11.0-rc1+ #521
[ 1295.412959] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[ 1295.413814] Call Trace:
[ 1295.414252] dump_stack+0x63/0x86
[ 1295.414666] __warn+0xcb/0xf0
[ 1295.415087] warn_slowpath_fmt+0x5f/0x80
[ 1295.415502] ? put_filp+0x42/0x50
[ 1295.415927] nfserrno+0x74/0x80 [nfsd]
[ 1295.416339] nfsd_open+0xd7/0x180 [nfsd]
[ 1295.416746] nfs4_get_vfs_file+0x367/0x3c0 [nfsd]
[ 1295.417182] ? security_inode_permission+0x41/0x60
[ 1295.417591] nfsd4_process_open2+0x9b2/0x1200 [nfsd]
[ 1295.418007] nfsd4_open+0x481/0x790 [nfsd]
[ 1295.418409] nfsd4_proc_compound+0x395/0x680 [nfsd]
[ 1295.418812] nfsd_dispatch+0xb8/0x1f0 [nfsd]
[ 1295.419233] svc_process_common+0x4d9/0x830 [sunrpc]
[ 1295.419631] svc_process+0xfe/0x1b0 [sunrpc]
[ 1295.420033] nfsd+0xe9/0x150 [nfsd]
[ 1295.420420] kthread+0x101/0x140
[ 1295.420802] ? nfsd_destroy+0x60/0x60 [nfsd]
[ 1295.421199] ? kthread_park+0x90/0x90
[ 1295.421598] ret_from_fork+0x2c/0x40
[ 1295.421996] ---[ end trace 0d5a969cd7852e1f ]---

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 52e380e0 31-Dec-2016 Kinglong Mee <kinglongmee@gmail.com>

NFSD: cleanup dead codes and values in nfsd_write

This is just cleanup, no change in functionality.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 54bbb7d2 31-Dec-2016 Kinglong Mee <kinglongmee@gmail.com>

NFSD: pass an integer for stable type to nfsd_vfs_write

After fae5096ad217 "nfsd: assume writeable exportabled filesystems have
f_sync" we no longer modify this argument.

This is just cleanup, no change in functionality.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 42e61616 04-Oct-2016 J. Bruce Fields <bfields@redhat.com>

nfsd: handle EUCLEAN

Eric Sandeen reports that xfs can return this if filesystem corruption
prevented completing the operation.

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# ff30f08c 03-Oct-2016 J. Bruce Fields <bfields@redhat.com>

nfsd: only WARN once on unmapped errors

No need to spam the logs here.

The only drawback is losing information if we ever encounter two
different unmapped errors, but in practice we've rarely see even one.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 31051c85 26-May-2016 Jan Kara <jack@suse.cz>

fs: Give dentry to inode_change_ok() instead of inode

inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>


# b44061d0 20-Jul-2016 J. Bruce Fields <bfields@redhat.com>

nfsd: reorganize nfsd_create

There's some odd logic in nfsd_create() that allows it to be called with
the parent directory either locked or unlocked. The only already-locked
caller is NFSv2's nfsd_proc_create(). It's less confusing to split out
the unlocked case into a separate function which the NFSv2 code can call
directly.

Also fix some comments while we're here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 12391d07 19-Jul-2016 J. Bruce Fields <bfields@redhat.com>

nfsd: remove redundant zero-length check from create

lookup_one_len already has this check.

The only effect of this patch is to return access instead of perm in the
0-length-filename case. I actually prefer nfserr_perm (or _inval?), but
I doubt anyone cares.

The isdotent check seems redundant too, but I worry that some client
might actually care about that strange nfserr_exist error.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# cc265089 08-May-2015 Andreas Gruenbacher <andreas.gruenbacher@gmail.com>

nfsd: Disable NFSv2 timestamp workaround for NFSv3+

NFSv2 can set the atime and/or mtime of a file to specific timestamps but not
to the server's current time. To implement the equivalent of utimes("file",
NULL), it uses a heuristic.

NFSv3 and later do support setting the atime and/or mtime to the server's
current time directly. The NFSv2 heuristic is still enabled, and causes
timestamps to be set wrong sometimes.

Fix this by moving the heuristic into the NFSv2 specific code. We can leave it
out of the create code path: the owner can always set timestamps arbitrarily,
and the workaround would never trigger.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 2b0143b5 17-Mar-2015 David Howells <dhowells@redhat.com>

VFS: normal filesystems (and lustre): d_inode() annotations

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# b3fbfe0e 29-Jul-2014 Jeff Layton <jlayton@kernel.org>

nfsd: print status when nfsd4_open fails to open file it just created

It's possible for nfsd to fail opening a file that it has just created.
When that happens, we throw a WARN but it doesn't include any info about
the error code. Print the status code to give us a bit more info.

Our QA group hit some of these warnings under some very heavy stress
testing. My suspicion is that they hit the file-max limit, but it's hard
to know for sure. Go ahead and add a -ENFILE mapping to
nfserr_serverfault to make the error more distinct (and correct).

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 62814d6a 03-Jul-2014 Jeff Layton <jlayton@kernel.org>

nfsd: add a nfserrno mapping for -E2BIG to nfserr_fbig

I saw this pop up with some pynfs testing:

[ 123.609992] nfsd: non-standard errno: -7

...and -7 is -E2BIG. I think what happened is that XFS returned -E2BIG
due to some xattr operations with the ACL10 pynfs TEST (I guess it has
limited xattr size?).

Add a better mapping for that error since it's possible that we'll need
it. How about we convert it to NFSERR_FBIG? As Bruce points out, they
both have "BIG" in the name so it must be good.

Also, turn the printk in this function into a WARN() so that we can get
a bit more information about situations that don't have proper mappings.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 1e444f5b 01-Jul-2014 Kinglong Mee <kinglongmee@gmail.com>

NFSD: Remove iattr parameter from nfsd_symlink()

Commit db2e747b1499 (vfs: remove mode parameter from vfs_symlink())
have remove mode parameter from vfs_symlink.
So that, iattr isn't needed by nfsd_symlink now, just remove it.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 52ee0433 20-Jun-2014 J. Bruce Fields <bfields@redhat.com>

nfsd: let nfsd_symlink assume null-terminated data

Currently nfsd_symlink has a weird hack to serve callers who don't
null-terminate symlink data: it looks ahead at the next byte to see if
it's zero, and copies it to a new buffer to null-terminate if not.

That means callers don't have to null-terminate, but they *do* have to
ensure that the byte following the end of the data is theirs to read.

That's a bit subtle, and the NFSv4 code actually got this wrong.

So let's just throw out that code and let callers pass null-terminated
strings; we've already fixed them to do that.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 0aeae33f 20-Jun-2014 J. Bruce Fields <bfields@redhat.com>

nfsd: make NFSv2 null terminate symlink data

It's simple enough for NFSv2 to null-terminate the symlink data.

A bit weird (it depends on knowing that we've already read the following
byte, which is either padding or part of the mode), but no worse than
the conditional kstrdup it otherwise relies on in nfsd_symlink().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3dadecce 24-Jan-2013 Al Viro <viro@zeniv.linux.org.uk>

switch vfs_getattr() to struct path

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 4a55c101 12-Jun-2012 Jan Kara <jack@suse.cz>

nfsd: Push mnt_want_write() outside of i_mutex

When mnt_want_write() starts to handle freezing it will get a full lock
semantics requiring proper lock ordering. So push mnt_want_write() call
consistently outside of i_mutex.

CC: linux-nfs@vger.kernel.org
CC: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 3c726023 04-Jan-2011 J. Bruce Fields <bfields@redhat.com>

nfsd4: return nfs errno from name_to_id functions

This avoids the need for the confusing ESRCH mapping.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# f6af99ec 04-Jan-2011 J. Bruce Fields <bfields@redhat.com>

nfsd4: name->id mapping should fail with BADOWNER not BADNAME

According to rfc 3530 BADNAME is for strings that represent paths;
BADOWNER is for user/group names that don't map.

And the too-long name should probably be BADOWNER as well; it's
effectively the same as if we couldn't map it.

Cc: stable@kernel.org
Reported-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 062304a8 02-Jan-2011 J. Bruce Fields <bfields@redhat.com>

nfsd: stop translating EAGAIN to nfserr_dropit

We no longer need this.

Also, EWOULDBLOCK is generally a synonym for EAGAIN, but that may not be
true on all architectures, so map it as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3beb6cd1 01-Jan-2011 J. Bruce Fields <bfields@redhat.com>

nfsd: don't drop requests on -ENOMEM

We never want to drop a request if we could return a JUKEBOX/DELAY error
instead; so, convert to nfserr_jukebox and let nfsd_dispatch() convert
that to a dropit error as a last resort if JUKEBOX/DELAY is unavailable
(as in the NFSv2 case).

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 039a87ca 30-Jul-2010 J. Bruce Fields <bfields@redhat.com>

nfsd: minor nfsd read api cleanup

Christoph points that the NFSv2/v3 callers know which case they want
here, so we may as well just call the file=NULL case directly instead of
making this conditional.

Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 69049961 20-Jul-2010 Andi Kleen <andi@firstfloor.org>

gcc-4.6: nfsd: fix initialized but not read warnings

Fixes at least one real minor bug: the nfs4 recovery dir sysctl
would not return its status properly.

Also I finished Al's 1e41568d7378d ("Take ima_path_check() in nfsd
past dentry_open() in nfsd_open()") commit, it moved the IMA
code, but left the old path initializer in there.

The rest is just dead code removed I think, although I was not
fully sure about the "is_borc" stuff. Some more review
would be still good.

Found by gcc 4.6's new warnings.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 7663dacd 04-Dec-2009 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: remove pointless paths in file headers

The new .h files have paths at the top that are now out of date. While
we're here, just remove all of those from fs/nfsd; they never served any
purpose.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 9a74af21 03-Dec-2009 Boaz Harrosh <bharrosh@panasas.com>

nfsd: Move private headers to source directory

Lots of include/linux/nfsd/* headers are only used by
nfsd module. Move them to the source directory

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 341eb184 03-Dec-2009 Boaz Harrosh <bharrosh@panasas.com>

nfsd: Source files #include cleanups

Now that the headers are fixed and carry their own wait, all fs/nfsd/
source files can include a minimal set of headers. and still compile just
fine.

This patch should improve the compilation speed of the nfsd module.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 0a3adade 04-Nov-2009 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: make fs/nfsd/vfs.h for common includes

None of this stuff is used outside nfsd, so move it out of the common
linux include directory.

Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really
belongs there, so later we may remove that file entirely.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# f39bde24 27-Sep-2009 J. Bruce Fields <bfields@citi.umich.edu>

nfsd4: fix error return when pseudoroot missing

We really shouldn't hit this case at all, and forthcoming kernel and
nfs-utils changes should eliminate this case; if it does happen,
consider it a bug rather than reporting an error that doesn't really
make sense for the operation (since there's no reason for a server to be
accepting v4 traffic yet have no root filehandle).

Also move some exp_pseudoroot code into a helper function while we're
here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# b9081d90 09-Jun-2009 Yu Zhiguo <yuzg@cn.fujitsu.com>

NFS: kill off complicated macro 'PROC'

kill off obscure macro 'PROC' of NFSv2&3 in order to make the code more clear.

Among other things, this makes it simpler to grep for callers of these
functions--something which has frequently caused confusion among nfs
developers.

Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 31dec253 05-Mar-2009 David Shaw <dshaw@jabberwocky.com>

Short write in nfsd becomes a full write to the client

If a filesystem being written to via NFS returns a short write count
(as opposed to an error) to nfsd, nfsd treats that as a success for
the entire write, rather than the short count that actually succeeded.

For example, given a 8192 byte write, if the underlying filesystem
only writes 4096 bytes, nfsd will ack back to the nfs client that all
8192 bytes were written. The nfs client does have retry logic for
short writes, but this is never called as the client is told the
complete write succeeded.

There are probably other ways it could happen, but in my case it
happened with a fuse (filesystem in userspace) filesystem which can
rather easily have a partial write.

Here is a patch to properly return the short write count to the
client.

Signed-off-by: David Shaw <dshaw@jabberwocky.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# b7aeda40 15-Dec-2008 Dean Hildebrand <dhildeb@us.ibm.com>

nfsd: add etoosmall to nfserrno

Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 04716e66 07-Aug-2008 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: permit unauthenticated stat of export root

RFC 2623 section 2.3.2 permits the server to bypass gss authentication
checks for certain operations that a client may perform when mounting.
In the case of a client that doesn't have some form of credentials
available to it on boot, this allows it to perform the mount unattended.
(Presumably real file access won't be needed until a user with
credentials logs in.)

Being slightly more lenient allows lots of old clients to access
krb5-only exports, with the only loss being a small amount of
information leaked about the root directory of the export.

This affects only v2 and v3; v4 still requires authentication for all
access.

Thanks to Peter Staubach testing against a Solaris client, which
suggesting addition of v3 getattr, to the list, and to Trond for noting
that doing so exposes no additional information.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>


# 8837abca 16-Jun-2008 Miklos Szeredi <mszeredi@suse.cz>

nfsd: rename MAY_ flags

Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it
clear, that these are not used outside nfsd, and to avoid name and
number space conflicts with the VFS.

[comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 599eb304 18-Jun-2008 NeilBrown <neilb@suse.de>

knfsd: nfsd: Handle ERESTARTSYS from syscalls.

OCFS2 can return -ERESTARTSYS from write requests (and possibly
elsewhere) if there is a signal pending.

If nfsd is shutdown (by sending a signal to each thread) while there
is still an IO load from the client, each thread could handle one last
request with a signal pending. This can result in -ERESTARTSYS
which is not understood by nfserrno() and so is reflected back to
the client as nfserr_io aka -EIO. This is wrong.

Instead, interpret ERESTARTSYS to mean "try again later" by returning
nfserr_jukebox. The client will resend and - if the server is
restarted - the write will (hopefully) be successful and everyone will
be happy.

The symptom that I narrowed down to this was:
copy a large file via NFS to an OCFS2 filesystem, and restart
the nfs server during the copy.
The 'cp' might get an -EIO, and the file will be corrupted -
presumably holes in the middle where writes appeared to fail.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 54775491 14-Feb-2008 Jan Blunck <jblunck@suse.de>

Use struct path in struct svc_export

I'm embedding struct path into struct svc_export.

[akpm@linux-foundation.org: coding-style fixes]
[ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename]
Signed-off-by: Jan Blunck <jblunck@suse.de>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 0ec757df 17-Jul-2007 J. Bruce Fields <bfields@citi.umich.edu>

knfsd: nfsd4: make readonly access depend on pseudoflavor

Allow readonly access to vary depending on the pseudoflavor, using the flag
passed with each pseudoflavor in the export downcall. The rest of the flags
are ignored for now, though some day we might also allow id squashing to vary
based on the flavor.

Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# cd123012 09-May-2007 Jeff Layton <jlayton@kernel.org>

RPC: add wrapper for svc_reserve to account for checksum

When the kernel calls svc_reserve to downsize the expected size of an RPC
reply, it fails to account for the possibility of a checksum at the end of
the packet. If a client mounts a NFSv2/3 with sec=krb5i/p, and does I/O
then you'll generally see messages similar to this in the server's ring
buffer:

RPC request reserved 164 but used 208

While I was never able to verify it, I suspect that this problem is also
the root cause of some oopses I've seen under these conditions:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726

This is probably also a problem for other sec= types and for NFSv4. The
large reserved size for NFSv4 compound packets seems to generally paper
over the problem, however.

This patch adds a wrapper for svc_reserve that accounts for the possibility
of a checksum. It also fixes up the appropriate callers of svc_reserve to
call the wrapper. For now, it just uses a hardcoded value that I
determined via testing. That value may need to be revised upward as things
change, or we may want to eventually add a new auth_op that attempts to
calculate this somehow.

Unfortunately, there doesn't seem to be a good way to reliably determine
the expected checksum length prior to actually calculating it, particularly
with schemes like spkm3.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ad06e4bd 12-Feb-2007 Chuck Lever <chuck.lever@oracle.com>

[PATCH] knfsd: SUNRPC: Add a function to format the address in an svc_rqst for printing

There are loads of places where the RPC server assumes that the rq_addr fields
contains an IPv4 address. Top among these are error and debugging messages
that display the server's IP address.

Let's refactor the address printing into a separate function that's smart
enough to figure out the difference between IPv4 and IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c4d987ba 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk>

[PATCH] nfsd: NFSv{2,3} trivial endianness annotations for error values

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 63f10311 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk>

[PATCH] nfsd: nfserrno() endianness annotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 7111c66e 20-Oct-2006 Al Viro <viro@ftp.linux.org.uk>

[PATCH] fix svc_procfunc declaration

svc_procfunc instances return __be32, not int

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 7adae489 04-Oct-2006 Greg Banks <gnb@melbourne.sgi.com>

[PATCH] knfsd: Prepare knfsd for support of rsize/wsize of up to 1MB, over TCP

The limit over UDP remains at 32K. Also, make some of the apparently
arbitrary sizing constants clearer.

The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of
the rqstp. This allows it to be different for different protocols (udp/tcp)
and also allows it to depend on the servers declared sv_bufsiz.

Note that we don't actually increase sv_bufsz for nfs yet. That comes next.

Signed-off-by: Greg Banks <gnb@melbourne.sgi.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 3cc03b16 04-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom

.. by allocating the array of 'kvec' in 'struct svc_rqst'.

As we plan to increase RPCSVC_MAXPAGES from 8 upto 256, we can no longer
allocate an array of this size on the stack. So we allocate it in 'struct
svc_rqst'.

However svc_rqst contains (indirectly) an array of the same type and size
(actually several, but they are in a union). So rather than waste space, we
move those arrays out of the separately allocated union and into svc_rqst to
share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at
different times, so there is no conflict).

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 7ed94296 04-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: nfsd: lockdep annotation fix

nfsv2 needs the I_MUTEX_PARENT on the directory when creating a file too.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 7775f4c8 10-Apr-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Correct reserved reply space for read requests.

NFSd makes sure there is enough space to hold the maximum possible reply
before accepting a request. The units for this maximum is (4byte) words.
However in three places, particularly for read request, the number given is
a number of bytes.

This means too much space is reserved which is slightly wasteful.

This is the sort of patch that could uncover a deeper bug, and it is not
critical, so it would be best for it to spend a while in -mm before going
in to mainline.

(akpm: target 2.6.17-rc2, 2.6.16.3 (approx))

Discovered-by: "Eivind Sarto" <ivan@kasenna.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 846f2fcd 18-Jan-2006 David Shaw <dshaw@jabberwocky.com>

[PATCH] knfsd: Provide missing NFSv2 part of patch for checking vfs_getattr.

A recent patch which checked the return status of vfs_getattr in nfsd,
completely missed the nfsproc.c (NFSv2) part. Here is it.

This patch moved the call to vfs_getattr from the xdr encoding (at which point
it is too late to return an error) to the call handling. This means several
calls to vfs_getattr are needed in nfsproc.c. Many are encapsulated in
nfsd_return_attrs and nfsd_return_dirop.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# a838cc49 22-Jun-2005 Andreas Gruenbacher <agruen@suse.de>

[PATCH] NFSD: Add NFS3ERR_NOTSUPP to the nfsd error mapping table

Add the missing NFS3ERR_NOTSUPP error code (defined in NFSv3) to the
system-to-protocol-error table in nfsd. The nfsacl extension uses this error
code.

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!