#
24d92de9 |
|
15-Feb-2024 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
nfsd: Fix NFSv3 atomicity bugs in nfsd_setattr() The main point of the guarded SETATTR is to prevent races with other WRITE and SETATTR calls. That requires that the check of the guard time against the inode ctime be done after taking the inode lock. Furthermore, we need to take into account the 32-bit nature of timestamps in NFSv3, and the possibility that files may change at a faster rate than once a second. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
11fec9b9 |
|
04-Oct-2023 |
Jeff Layton <jlayton@kernel.org> |
nfsd: convert to new timestamp accessors Convert to using the new inode timestamp accessor functions. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20231004185347.80880-50-jlayton@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
|
#
afb8aae5 |
|
27-Sep-2023 |
KaiLong Wang <wangkailong@jari.cn> |
nfsd: Clean up errors in nfs3proc.c Fix the following errors reported by checkpatch: ERROR: need consistent spacing around '+' (ctx:WxV) ERROR: spaces required around that '?' (ctx:VxW) Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
a332018a |
|
21-Jul-2023 |
Jeff Layton <jlayton@kernel.org> |
nfsd: handle failure to collect pre/post-op attrs more sanely Collecting pre_op_attrs can fail, in which case it's probably best to fail the whole operation. Change fh_fill_pre_attrs and fh_fill_both_attrs to return __be32, and have the callers check the return code and abort the operation if it's not nfs_ok. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
507df40e |
|
18-May-2023 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Hoist rq_vec preparation into nfsd_read() Accrue the following benefits: a) Deduplicate this common bit of code. b) Don't prepare rq_vec for NFSv2 and NFSv3 spliced reads, which don't use rq_vec. This is already the case for nfsd4_encode_read(). c) Eventually, converting NFSD's read path to use a bvec iterator will be simpler. In the next patch, nfsd_iter_read() will replace nfsd_readv() for all NFS versions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
65ba3d24 |
|
10-Jan-2023 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Use per-CPU counters to tally server RPC counts - Improves counting accuracy - Reduces cross-CPU memory traffic Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
abf08576 |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port vfs_*() helpers to struct mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
4d1ea845 |
|
28-Oct-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection NFSv4 operations manage the lifetime of nfsd_file items they use by means of NFSv4 OPEN and CLOSE. Hence there's no need for them to be garbage collected. Introduce a mechanism to enable garbage collection for nfsd_file items used only by NFSv2/3 callers. Note that the change in nfsd_file_put() ensures that both CLOSE and DELEGRETURN will actually close out and free an nfsd_file on last reference of a non-garbage-collected file. Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=394 Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
#
c2528490 |
|
28-Oct-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Pass the target nfsd_file to nfsd_commit() In a moment I'm going to introduce separate nfsd_file types, one of which is garbage-collected; the other, not. The garbage-collected variety is to be used by NFSv2 and v3, and the non-garbage-collected variety is to be used by NFSv4. nfsd_commit() is invoked by both NFSv3 and NFSv4 consumers. We want nfsd_commit() to find and use the correct variety of cached nfsd_file object for the NFS version that is in use. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de>
|
#
98124f5b |
|
12-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Refactor common code out of dirlist helpers The dust has settled a bit and it's become obvious what code is totally common between nfsd_init_dirlist_pages() and nfsd3_init_dirlist_pages(). Move that common code to SUNRPC. The new helper brackets the existing xdr_init_decode_pages() API. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
103cc1fa |
|
12-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Parametrize how much of argsize should be zeroed Currently, SUNRPC clears the whole of .pc_argsize before processing each incoming RPC transaction. Add an extra parameter to struct svc_procedure to enable upper layers to reduce the amount of each operation's argument structure that is zeroed by SUNRPC. The size of struct nfsd4_compoundargs, in particular, is a lot to clear on each incoming RPC Call. A subsequent patch will cut this down to something closer to what NFSv2 and NFSv3 uses. This patch should cause no behavior changes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
fa6be9cc |
|
01-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Protect against send buffer overflow in NFSv3 READ Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. A client can force this shrinkage on TCP by sending a correctly- formed RPC Call header contained in an RPC record that is excessively large. The full maximum payload size cannot be constructed in that case. Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
640f87c1 |
|
01-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Protect against send buffer overflow in NFSv3 READDIR Since before the git era, NFSD has conserved the number of pages held by each nfsd thread by combining the RPC receive and send buffers into a single array of pages. This works because there are no cases where an operation needs a large RPC Call message and a large RPC Reply message at the same time. Once an RPC Call has been received, svc_process() updates svc_rqst::rq_res to describe the part of rq_pages that can be used for constructing the Reply. This means that the send buffer (rq_res) shrinks when the received RPC record containing the RPC Call is large. A client can force this shrinkage on TCP by sending a correctly- formed RPC Call header contained in an RPC record that is excessively large. The full maximum payload size cannot be constructed in that case. Thanks to Aleksi Illikainen and Kari Hulkko for uncovering this issue. Reported-by: Ben Ronallo <Benjamin.Ronallo@synopsys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
debf16f0 |
|
26-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFSD: use explicit lock/unlock for directory ops When creating or unlinking a name in a directory use explicit inode_lock_nested() instead of fh_lock(), and explicit calls to fh_fill_pre_attrs() and fh_fill_post_attrs(). This is already done for renames, with lock_rename() as the explicit locking. Also move the 'fill' calls closer to the operation that might change the attributes. This way they are avoided on some error paths. For the v2-only code in nfsproc.c, the fill calls are not replaced as they aren't needed. Making the locking explicit will simplify proposed future changes to locking for directories. It also makes it easily visible exactly where pre/post attributes are used - not all callers of fh_lock() actually need the pre/post attributes. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
b677c0c6 |
|
26-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFSD: always drop directory lock in nfsd_unlink() Some error paths in nfsd_unlink() allow it to exit without unlocking the directory. This is not a problem in practice as the directory will be locked with an fh_put(), but it is untidy and potentially confusing. This allows us to remove all the fh_unlock() calls that are immediately after nfsd_unlink() calls. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
927bfc56 |
|
26-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning. nfsd_create() usually returns with the directory still locked. nfsd_symlink() usually returns with it unlocked. This is clumsy. Until recently nfsd_create() needed to keep the directory locked until ACLs and security label had been set. These are now set inside nfsd_create() (in nfsd_setattr()) so this need is gone. So change nfsd_create() and nfsd_symlink() to always unlock, and remove any fh_unlock() calls that follow calls to these functions. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
93adc1e3 |
|
26-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFSD: set attributes when creating symlinks The NFS protocol includes attributes when creating symlinks. Linux does store attributes for symlinks and allows them to be set, though they are not used for permission checking. NFSD currently doesn't set standard (struct iattr) attributes when creating symlinks, but for NFSv4 it does set ACLs and security labels. This is inconsistent. To improve consistency, pass the provided attributes into nfsd_symlink() and call nfsd_create_setattr() to set them. NOTE: this results in a behaviour change for all NFS versions when the client sends non-default attributes with a SYMLINK request. With the Linux client, the only attributes are: attr.ia_mode = S_IFLNK | S_IRWXUGO; attr.ia_valid = ATTR_MODE; so the final outcome will be unchanged. Other clients might sent different attributes, and if they did they probably expect them to be honoured. We ignore any error from nfsd_create_setattr(). It isn't really clear what should be done if a file is successfully created, but the attributes cannot be set. NFS doesn't allow partial success to be reported. Reporting failure is probably more misleading than reporting success, so the status is ignored. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
7fe2a71d |
|
26-Jul-2022 |
NeilBrown <neilb@suse.de> |
NFSD: introduce struct nfsd_attrs The attributes that nfsd might want to set on a file include 'struct iattr' as well as an ACL and security label. The latter two are passed around quite separately from the first, in part because they are only needed for NFSv4. This leads to some clumsiness in the code, such as the attributes NOT being set in nfsd_create_setattr(). We need to keep the directory locked until all attributes are set to ensure the file is never visibile without all its attributes. This need combined with the inconsistent handling of attributes leads to more clumsiness. As a first step towards tidying this up, introduce 'struct nfsd_attrs'. This is passed (by reference) to vfs.c functions that work with attributes, and is assembled by the various nfs*proc functions which call them. As yet only iattr is included, but future patches will expand this. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
df9606ab |
|
28-Mar-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Refactor NFSv3 CREATE The NFSv3 CREATE and NFSv4 OPEN(CREATE) use cases are about to diverge such that it makes sense to split do_nfsd_create() into one version for NFSv3 and one for NFSv4. As a first step, copy do_nfsd_create() to nfs3proc.c and remove NFSv4-specific logic. One immediate legibility benefit is that the logic for handling NFSv3 createhow is now quite straightforward. NFSv4 createhow has some subtleties that IMO do not belong in generic code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
e6156859 |
|
25-Mar-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Clean up nfsd3_proc_create() As near as I can tell, mode bit masking and setting S_IFREG is already done by do_nfsd_create() and vfs_create(). The NFSv4 path (do_open_lookup), for example, does not bother with this special processing. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
3f965021 |
|
24-Jan-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: COMMIT operations must not return NFS?ERR_INVAL Since, well, forever, the Linux NFS server's nfsd_commit() function has returned nfserr_inval when the passed-in byte range arguments were non-sensical. However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests are permitted to return only the following non-zero status codes: NFS3ERR_IO NFS3ERR_STALE NFS3ERR_BADHANDLE NFS3ERR_SERVERFAULT NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL is not listed in the COMMIT row of Table 6 in RFC 8881. RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not specify when it can or should be used. Instead of dropping or failing a COMMIT request in a byte range that is not supported, turn it into a valid request by treating one or both arguments as zero. Offset zero means start-of-file, count zero means until-end-of-file, so we only ever extend the commit range. NFS servers are always allowed to commit more and sooner than requested. The range check is no longer bounded by NFS_OFFSET_MAX, but rather by the value that is returned in the maxfilesize field of the NFSv3 FSINFO procedure or the NFSv4 maxfilesize file attribute. Note that this change results in a new pynfs failure: CMT4 st_commit.testCommitOverflow : RUNNING CMT4 st_commit.testCommitOverflow : FAILURE COMMIT with offset + count overflow should return NFS4ERR_INVAL, instead got NFS4_OK IMO the test is not correct as written: RFC 8881 does not allow the COMMIT operation to return NFS4ERR_INVAL. Reported-by: Dan Aloni <dan.aloni@vastdata.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Bruce Fields <bfields@fieldses.org>
|
#
6260d9a5 |
|
25-Jan-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Clamp WRITE offsets Ensure that a client cannot specify a WRITE range that falls in a byte range outside what the kernel's internal types (such as loff_t, which is signed) can represent. The kiocb iterators, invoked in nfsd_vfs_write(), should properly limit write operations to within the underlying file system's s_maxbytes. Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
0cb4d23a |
|
04-Feb-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Fix the behavior of READ near OFFSET_MAX Dan Aloni reports: > Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to > the RPC read layers") on the client, a read of 0xfff is aligned up > to server rsize of 0x1000. > > As a result, in a test where the server has a file of size > 0x7fffffffffffffff, and the client tries to read from the offset > 0x7ffffffffffff000, the read causes loff_t overflow in the server > and it returns an NFS code of EINVAL to the client. The client as > a result indefinitely retries the request. The Linux NFS client does not handle NFS?ERR_INVAL, even though all NFS specifications permit servers to return that status code for a READ. Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed and return a short result. Set the EOF flag in the result to prevent the client from retrying the READ request. This behavior appears to be consistent with Solaris NFS servers. Note that NFSv3 and NFSv4 use u64 offset values on the wire. These must be converted to loff_t internally before use -- an implicit type cast is not adequate for this purpose. Otherwise VFS checks against sb->s_maxbytes do not work properly. Reported-by: Dan Aloni <dan.aloni@vastdata.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
6a2f7744 |
|
21-Dec-2021 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Fix zero-length NFSv3 WRITEs The Linux NFS server currently responds to a zero-length NFSv3 WRITE request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE with NFS4_OK and count of zero. RFC 1813 says of the WRITE procedure's @count argument: count The number of bytes of data to be written. If count is 0, the WRITE will succeed and return a count of 0, barring errors due to permissions checking. RFC 8881 has similar language for NFSv4, though NFSv4 removed the explicit @count argument because that value is already contained in the opaque payload array. The synthetic client pynfs's WRT4 and WRT15 tests do emit zero- length WRITEs to exercise this spec requirement. Commit fdec6114ee1f ("nfsd4: zero-length WRITE should succeed") addressed the same problem there with the same fix. But interestingly the Linux NFS client does not appear to emit zero- length WRITEs, instead squelching them. I'm not aware of a test that can generate such WRITEs for NFSv3, so I wrote a naive C program to generate a zero-length WRITE and test this fix. Fixes: 8154ef2776aa ("NFSD: Clean up legacy NFS WRITE argument XDR decoders") Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
53b1119a |
|
16-Dec-2021 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Fix READDIR buffer overflow If a client sends a READDIR count argument that is too small (say, zero), then the buffer size calculation in the new init_dirlist helper functions results in an underflow, allowing the XDR stream functions to write beyond the actual buffer. This calculation has always been suspect. NFSD has never sanity- checked the READDIR count argument, but the old entry encoders managed the problem correctly. With the commits below, entry encoding changed, exposing the underflow to the pointer arithmetic in xdr_reserve_space(). Modern NFS clients attempt to retrieve as much data as possible for each READDIR request. Also, we have no unit tests that exercise the behavior of READDIR at the lower bound of @count values. Thus this case was missed during testing. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Fixes: f5dcccd647da ("NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream") Fixes: 7f87fc2d34d4 ("NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream") Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
dae9a6ca |
|
30-Sep-2021 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Have legacy NFSD WRITE decoders use xdr_stream_subsegment() Refactor. Now that the NFSv2 and NFSv3 XDR decoders have been converted to use xdr_streams, the WRITE decoder functions can use xdr_stream_subsegment() to extract the WRITE payload into its own xdr_buf, just as the NFSv4 WRITE XDR decoder currently does. That makes it possible to pass the first kvec, pages array + length, page_base, and total payload length via a single function parameter. The payload's page_base is not yet assigned or used, but will be in subsequent patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
76ed0dd9 |
|
15-Jan-2021 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances rq_next_page to the full size of the client-requested buffer, then releases all those pages at the end of the request. The next request to use that nfsd thread has to refill the pages. NFSD does this even when the dirlist in the reply is small. With NFSv3 clients that send READDIR operations with large buffer sizes, that can be 256 put_page/alloc_page pairs per READDIR request, even though those pages often remain unused. We can save some work by not releasing dirlist buffer pages that were not used to form the READDIR Reply. I've left the NFSv2 code alone since there are never more than three pages involved in an NFSv2 READDIR Reply. Eventually we should nail down why these pages need to be released at all in order to avoid allocating and releasing pages unnecessarily. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
7f87fc2d |
|
22-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream The benefit of the xdr_stream helpers is that they transparently handle encoding an XDR data item that crosses page boundaries. Most of the open-coded logic to do that here can be eliminated. A sub-buffer and sub-stream are set up as a sink buffer for the directory entry encoder. As an entry is encoded, it is added to the end of the content in this buffer/stream. The total length of the directory list is tracked in the buffer's @len field. When it comes time to encode the Reply, the sub-buffer is merged into rq_res's page array at the correct place using xdr_write_pages(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
e4ccfe30 |
|
22-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
a1409e2d |
|
09-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder Clean up: Counting the bytes used by each returned directory entry seems less brittle to me than trying to measure consumed pages after the fact. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
a161e6c7 |
|
10-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Add a helper that encodes NFSv3 directory offset cookies Refactor: De-duplicate identical code that handles encoding of directory offset cookies across page boundaries. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
cc9bcdad |
|
22-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
9a9c8923 |
|
22-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
5cf35335 |
|
22-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream Also, clean up: Rename the encoder function to match the name of the result structure in RFC 1813, consistent with other encoder function names in nfs3xdr.c. "diropres" is an NFSv2 thingie. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
2c42f804 |
|
21-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update the GETATTR3res encoder to use struct xdr_stream As an additional clean up, some renaming is done to more closely reflect the data type and variable names used in the NFSv3 XDR definition provided in RFC 1813. "attrstat" is an NFSv2 thingie. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
40116ebd |
|
17-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Add helper to set up the pages where the dirlist is encoded De-duplicate some code that is used by both READDIR and READDIRPLUS to build the dirlist in the Reply. Because this code is not related to decoding READ arguments, it is moved to a more appropriate spot. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
0a8f37fb |
|
10-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Fix returned READDIR offset cookie Code inspection shows that the server's NFSv3 READDIR implementation handles offset cookies slightly differently than the NFSv2 READDIR, NFSv3 READDIRPLUS, and NFSv4 READDIR implementations, and there doesn't seem to be any need for this difference. As a clean up, I copied the logic from nfsd3_proc_readdirplus(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
224c1c89 |
|
23-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update READLINK3arg decoder to use struct xdr_stream The NFSv3 READLINK request takes a single filehandle, so it can re-use GETATTR's decoder. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
be63bd2a |
|
20-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update READ3arg decoder to use struct xdr_stream The code that sets up rq_vec is refactored so that it is now adjacent to the nfsd_read() call site where it is used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
9575363a |
|
20-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Update GETATTR3args decoder to use struct xdr_stream Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
2289e87b |
|
17-Sep-2020 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Make trace_svc_process() display the RPC procedure symbolically The next few patches will employ these strings to help make server- side trace logs more human-readable. A similar technique is already in use in kernel RPC client code. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
788f7183 |
|
05-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Add common helpers to decode void args and encode void results Start off the conversion to xdr_stream by de-duplicating the functions that decode void arguments and encode void results. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
71fd7218 |
|
05-Nov-2020 |
Alex Shi <alex.shi@linux.alibaba.com> |
nfsd/nfs3: remove unused macro nfsd3_fhandleres The macro is unused, remove it to tame gcc warning: fs/nfsd/nfs3proc.c:702:0: warning: macro "nfsd3_fhandleres" is not used [-Wunused-macros] Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: linux-nfs@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
66d60e3a |
|
23-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL A late paragraph of RFC 1813 Section 3.3.11 states: | ... if the server does not support the target type or the | target type is illegal, the error, NFS3ERR_BADTYPE, should | be returned. Note that NF3REG, NF3DIR, and NF3LNK are | illegal types for MKNOD. The Linux NFS server incorrectly returns NFSERR_INVAL in these cases. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
cc028a10 |
|
02-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Hoist status code encoding into XDR encoder functions The original intent was presumably to reduce code duplication. The trade-off was: - No support for an NFSD proc function returning a non-success RPC accept_stat value. - No support for void NFS replies to non-NULL procedures. - Everyone pays for the deduplication with a few extra conditional branches in a hot path. In addition, nfsd_dispatch() leaves *statp uninitialized in the success path, unlike svc_generic_dispatch(). Address all of these problems by moving the logic for encoding the NFS status code into the NFS XDR encoders themselves. Then update the NFS .pc_func methods to return an RPC accept_stat value. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
14168d67 |
|
01-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Remove the RETURN_STATUS() macro Refactor: I'm about to change the return value from .pc_func. Clear the way by replacing the RETURN_STATUS() macro with logic that plants the status code directly into the response structure. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
dcc46991 |
|
01-Oct-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Encoder and decoder functions are always present nfsd_dispatch() is a hot path. Let's optimize the XDR method calls for the by-far common case, which is that the XDR methods are indeed present. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
19e0663f |
|
06-Jan-2020 |
Trond Myklebust <trondmy@gmail.com> |
nfsd: Ensure sampling of the write verifier is atomic with the write When doing an unstable write, we need to ensure that we sample the write verifier before releasing the lock, and allowing a commit to the same file to proceed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
524ff1af |
|
06-Jan-2020 |
Trond Myklebust <trondmy@gmail.com> |
nfsd: Ensure sampling of the commit verifier is atomic with the commit When we have a successful commit, ensure we sample the commit verifier before releasing the lock. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
83a63072 |
|
26-Aug-2019 |
Trond Myklebust <trondmy@gmail.com> |
nfsd: fix nfs read eof detection Currently, the knfsd server assumes that a short read indicates an end of file. That assumption is incorrect. The short read means that either we've hit the end of file, or we've hit a read error. In the case of a read error, the client may want to retry (as per the implementation recommendations in RFC1813 and RFC7530), but currently it is being told that it hit an eof. Move the code to detect eof from version specific code into the generic nfsd read. Report eof only in the two following cases: 1) read() returns a zero length short read with no error. 2) the offset+length of the read is >= the file size. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3c86794a |
|
04-Apr-2019 |
Murphy Zhou <jencce.kernel@gmail.com> |
nfsd/nfsd3_proc_readdir: fix buffer count and page pointers After this commit f875a79 nfsd: allow nfsv3 readdir request to be larger. nfsv3 readdir request size can be larger than PAGE_SIZE. So if the directory been read is large enough, we can use multiple pages in rq_respages. Update buffer count and page pointers like we do in readdirplus to make this happen. Now listing a directory within 3000 files will panic because we are counting in a wrong way and would write on random page. Fixes: f875a79 "nfsd: allow nfsv3 readdir request to be larger" Signed-off-by: Murphy Zhou <jencce.kernel@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f875a792 |
|
06-Mar-2019 |
NeilBrown <neilb@suse.com> |
nfsd: allow nfsv3 readdir request to be larger. nfsd currently reports the NFSv3 dtpref FSINFO parameter to be PAGE_SIZE, so NFS clients will typically ask for one page of directory entries at a time. This is needlessly restrictive as nfsd can handle larger replies easily. Also, a READDIR request (but not a READDIRPLUS request) has the count size clipped to PAGE_SIE, again unnecessary. This patch lifts these limits so that larger readdir requests can be used. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b602345d |
|
03-Mar-2019 |
NeilBrown <neilb@suse.com> |
nfsd: fix memory corruption caused by readdir If the result of an NFSv3 readdir{,plus} request results in the "offset" on one entry having to be split across 2 pages, and is sized so that the next directory entry doesn't fit in the requested size, then memory corruption can happen. When encode_entry() is called after encoding the last entry that fits, it notices that ->offset and ->offset1 are set, and so stores the offset value in the two pages as required. It clears ->offset1 but *does not* clear ->offset. Normally this omission doesn't matter as encode_entry_baggage() will be called, and will set ->offset to a suitable value (not on a page boundary). But in the case where cd->buflen < elen and nfserr_toosmall is returned, ->offset is not reset. This means that nfsd3proc_readdirplus will see ->offset with a value 4 bytes before the end of a page, and ->offset1 set to NULL. It will try to write 8bytes to ->offset. If we are lucky, the next page will be read-only, and the system will BUG: unable to handle kernel paging request at... If we are unlucky, some innocent page will have the first 4 bytes corrupted. nfsd3proc_readdir() doesn't even check for ->offset1, it just blindly writes 8 bytes to the offset wherever it is. Fix this by clearing ->offset after it is used, and copying the ->offset handling code from nfsd3_proc_readdirplus into nfsd3_proc_readdir. (Note that the commit hash in the Fixes tag is from the 'history' tree - this bug predates git). Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding") Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=0b1d57cf7654 Cc: stable@vger.kernel.org (v2.6.12+) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
11b4d66e |
|
27-Jul-2018 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Handle full-length symlinks I've given up on the idea of zero-copy handling of SYMLINK on the server side. This is because the Linux VFS symlink API requires the symlink pathname to be in a NUL-terminated kmalloc'd buffer. The NUL-termination is going to be problematic (watching out for landing on a page boundary and dealing with a 4096-byte pathname). I don't believe that SYMLINK creation is on a performance path or is requested frequently enough that it will cause noticeable CPU cache pollution due to data copies. There will be two places where a transport callout will be necessary to fill in the rqstp: one will be in the svc_fill_symlink_pathname() helper that is used by NFSv2 and NFSv3, and the other will be in nfsd4_decode_create(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3fd9557a |
|
27-Jul-2018 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Refactor the generic write vector fill helper fill_in_write_vector() is nearly the same logic as svc_fill_write_vector(), but there are a few differences so that the former can handle multiple WRITE payloads in a single COMPOUND. svc_fill_write_vector() can be adjusted so that it can be used in the NFSv4 WRITE code path too. Instead of assuming the pages are coming from rq_args.pages, have the caller pass in the page list. The immediate benefit is a reduction of code duplication. It also prevents the NFSv4 WRITE decoder from passing an empty vector element when the transport has provided the payload in the xdr_buf's page array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
38a70315 |
|
27-Mar-2018 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Clean up legacy NFS SYMLINK argument XDR decoders Move common code in NFSD's legacy SYMLINK decoders into a helper. The immediate benefits include: - one fewer data copies on transports that support DDP - consistent error checking across all versions - reduction of code duplication - support for both legal forms of SYMLINK requests on RDMA transports for all versions of NFS (in particular, NFSv2, for completeness) In the long term, this helper is an appropriate spot to perform a per-transport call-out to fill the pathname argument using, say, RDMA Reads. Filling the pathname in the proc function also means that eventually the incoming filehandle can be interpreted so that filesystem- specific memory can be allocated as a sink for the pathname argument, rather than using anonymous pages. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8154ef27 |
|
27-Mar-2018 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Clean up legacy NFS WRITE argument XDR decoders Move common code in NFSD's legacy NFS WRITE decoders into a helper. The immediate benefit is reduction of code duplication and some nice micro-optimizations (see below). In the long term, this helper can perform a per-transport call-out to fill the rq_vec (say, using RDMA Reads). The legacy WRITE decoders and procs are changed to work like NFSv4, which constructs the rq_vec just before it is about to call vfs_writev. Why? Calling a transport call-out from the proc instead of the XDR decoder means that the incoming FH can be resolved to a particular filesystem and file. This would allow pages from the backing file to be presented to the transport to be filled, rather than presenting anonymous pages and copying or flipping them into the file's page cache later. I also prefer using the pages in rq_arg.pages, instead of pulling the data pages directly out of the rqstp::rq_pages array. This is currently the way the NFSv3 write decoder works, but the other two do not seem to take this approach. Fixing this removes the only reference to rq_pages found in NFSD, eliminating an NFSD assumption about how transports use the pages in rq_pages. Lastly, avoid setting up the first element of rq_vec as a zero- length buffer. This happens with an RDMA transport when a normal Read chunk is present because the data payload is in rq_arg's page list (none of it is in the head buffer). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
aa8217d5 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_version instances as const Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
b9c744c1 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_procinfo instances as const struct svc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
0becc118 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: move pc_count out of struct svc_procinfo pc_count is the only writeable memeber of struct svc_procinfo, which is a good candidate to be const-ified as it contains function pointers. This patch moves it into out out struct svc_procinfo, and into a separate writable array that is pointed to by struct svc_version. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
d16d1867 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_encode callbacks Drop the resp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
cc6acc20 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_decode callbacks Drop the argp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
1150ded8 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_release callbacks Drop the p and resp arguments as they are always NULL or can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
1c8a5409 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_func callbacks Drop the argp and resp arguments as they can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to svc_procfunc as well as the svc_procfunc typedef itself. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
36ba89c2 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
nfsd: remove the unused PROC() macro in nfs3proc.c Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e9679189 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_version instances as const Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
860bda29 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_procinfo instances as const struct svc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
7fd38af9 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: move pc_count out of struct svc_procinfo pc_count is the only writeable memeber of struct svc_procinfo, which is a good candidate to be const-ified as it contains function pointers. This patch moves it into out out struct svc_procinfo, and into a separate writable array that is pointed to by struct svc_version. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
63f8de37 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_encode callbacks Drop the resp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
026fec7e |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_decode callbacks Drop the argp argument as it can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
8537488b |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_release callbacks Drop the p and resp arguments as they are always NULL or can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to kxdrproc_t. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a6beb732 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: properly type pc_func callbacks Drop the argp and resp arguments as they can trivially be derived from the rqstp argument. With that all functions now have the same prototype, and we can remove the unsafe casting to svc_procfunc as well as the svc_procfunc typedef itself. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
9482c9c1 |
|
08-May-2017 |
Christoph Hellwig <hch@lst.de> |
nfsd: remove the unused PROC() macro in nfs3proc.c Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
52e380e0 |
|
31-Dec-2016 |
Kinglong Mee <kinglongmee@gmail.com> |
NFSD: cleanup dead codes and values in nfsd_write This is just cleanup, no change in functionality. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
54bbb7d2 |
|
31-Dec-2016 |
Kinglong Mee <kinglongmee@gmail.com> |
NFSD: pass an integer for stable type to nfsd_vfs_write After fae5096ad217 "nfsd: assume writeable exportabled filesystems have f_sync" we no longer modify this argument. This is just cleanup, no change in functionality. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
fc64005c |
|
09-Apr-2016 |
Al Viro <viro@zeniv.linux.org.uk> |
don't bother with ->d_inode->i_sb - it's always equal to ->d_sb ... and neither can ever be NULL Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
ac503e4a |
|
22-Mar-2016 |
Benjamin Coddington <bcodding@redhat.com> |
nfsd: use short read as well as i_size to set eof Use the result of a local read to determine when to set the eof flag. This allows us to return the location of the end of the file atomically at the time of the read. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [bfields: add some documentation] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2b0143b5 |
|
17-Mar-2015 |
David Howells <dhowells@redhat.com> |
VFS: normal filesystems (and lustre): d_inode() annotations that's the bulk of filesystem drivers dealing with inodes of their own Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
18c01ab3 |
|
01-Aug-2014 |
Rajesh Ghanekar <Rajesh_Ghanekar@symantec.com> |
nfsd: allow turning off nfsv3 readdir_plus One of our customer's application only needs file names, not file attributes. With directories having 10K+ inodes (assuming buffer cache has directory blocks cached having file names, but inode cache is limited and hence need eviction of older cached inodes), older inodes are evicted periodically. So if they keep on doing readdir(2) from NSF client on multiple directories, some directory's files are periodically removed from inode cache and hence new readdir(2) on same directory requires disk access to bring back inodes again to inode cache. As READDIRPLUS request fetches attributes also, doing getattr on each file on server, it causes unnecessary disk accesses. If READDIRPLUS on NFS client is returned with -ENOTSUPP, NFS client uses READDIR request which just gets the names of the files in a directory, not attributes, hence avoiding disk accesses on server. There's already a corresponding client-side mount option, but an export option reduces the need for configuration across multiple clients. This flag affects NFSv3 only. If it turns out it's needed for NFSv4 as well then we may have to figure out how to extend the behavior to NFSv4, but it's not currently obvious how to do that. Signed-off-by: Rajesh Ghanekar <rajesh_ghanekar@symantec.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
63bab065 |
|
09-Aug-2014 |
Ross Lagerwall <rosslagerwall@gmail.com> |
nfsd3: Check write permission after checking existence When creating a file that already exists in a read-only directory with O_EXCL, the NFSv3 server returns EACCES rather than EEXIST (which local files and the NFSv4 server return). Fix this by checking the MAY_CREATE permission only if the file does not exist. Since this already happens in do_nfsd_create, the check in nfsd3_proc_create can simply be removed. Signed-off-by: Ross Lagerwall <rosslagerwall@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
1e444f5b |
|
01-Jul-2014 |
Kinglong Mee <kinglongmee@gmail.com> |
NFSD: Remove iattr parameter from nfsd_symlink() Commit db2e747b1499 (vfs: remove mode parameter from vfs_symlink()) have remove mode parameter from vfs_symlink. So that, iattr isn't needed by nfsd_symlink now, just remove it. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
52ee0433 |
|
20-Jun-2014 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: let nfsd_symlink assume null-terminated data Currently nfsd_symlink has a weird hack to serve callers who don't null-terminate symlink data: it looks ahead at the next byte to see if it's zero, and copies it to a new buffer to null-terminate if not. That means callers don't have to null-terminate, but they *do* have to ensure that the byte following the end of the data is theirs to read. That's a bit subtle, and the NFSv4 code actually got this wrong. So let's just throw out that code and let callers pass null-terminated strings; we've already fixed them to do that. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3c7aa15d |
|
10-Jun-2014 |
Kinglong Mee <kinglongmee@gmail.com> |
NFSD: Using min/max/min_t/max_t for calculate Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3dadecce |
|
24-Jan-2013 |
Al Viro <viro@zeniv.linux.org.uk> |
switch vfs_getattr() to struct path Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
afc59400 |
|
10-Dec-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: cleanup: replace rq_resused count by rq_next_page pointer It may be a matter of personal taste, but I find this makes the code clearer. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
95c7a20a |
|
27-Jul-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: do_nfsd_create verf argument is a u32 The types here are actually a bit of a mess. For now cast as we do in the v4 case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ac6721a1 |
|
20-Apr-2011 |
Mi Jinlong <mijinlong@cn.fujitsu.com> |
nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly The NFS server uses nfsd_create_v3 to handle EXCLUSIVE4_1 opens, but that function is not prepared to handle them. Rename nfsd_create_v3() to do_nfsd_create(), and add handling of EXCLUSIVE4_1. Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
18b631f8 |
|
29-Nov-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: fix offset printk's in nfsd3 read/write Thanks to dysbr01@ca.com for noticing that the debugging printk in the v3 write procedure can print >2GB offsets as negative numbers: https://bugzilla.kernel.org/show_bug.cgi?id=23342 Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
039a87ca |
|
30-Jul-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: minor nfsd read api cleanup Christoph points that the NFSv2/v3 callers know which case they want here, so we may as well just call the file=NULL case directly instead of making this conditional. Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
43a9aa64 |
|
06-Jul-2010 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Fill in WCC data for REMOVE, RMDIR, MKNOD, and MKDIR Some well-known NFSv3 clients drop their directory entry caches when they receive replies with no WCC data. Without this data, they employ extra READ, LOOKUP, and GETATTR requests to ensure their directory entry caches are up to date, causing performance to suffer needlessly. In order to return WCC data, our server has to have both the pre-op and the post-op attribute data on hand when a reply is XDR encoded. The pre-op data is filled in when the incoming fh is locked, and the post-op data is filled in when the fh is unlocked. Unfortunately, for REMOVE, RMDIR, MKNOD, and MKDIR, the directory fh is not unlocked until well after the reply has been XDR encoded. This means that encode_wcc_data() does not have wcc_data for the parent directory, so none is returned to the client after these operations complete. By unlocking the parent directory fh immediately after the internal operations for each NFS procedure is complete, the post-op data is filled in before XDR encoding starts, so it can be returned to the client properly. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
7663dacd |
|
04-Dec-2009 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd: remove pointless paths in file headers The new .h files have paths at the top that are now out of date. While we're here, just remove all of those from fs/nfsd; they never served any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
9a74af21 |
|
03-Dec-2009 |
Boaz Harrosh <bharrosh@panasas.com> |
nfsd: Move private headers to source directory Lots of include/linux/nfsd/* headers are only used by nfsd module. Move them to the source directory Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
341eb184 |
|
03-Dec-2009 |
Boaz Harrosh <bharrosh@panasas.com> |
nfsd: Source files #include cleanups Now that the headers are fixed and carry their own wait, all fs/nfsd/ source files can include a minimal set of headers. and still compile just fine. This patch should improve the compilation speed of the nfsd module. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
0a3adade |
|
04-Nov-2009 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd: make fs/nfsd/vfs.h for common includes None of this stuff is used outside nfsd, so move it out of the common linux include directory. Actually, probably none of the stuff in include/linux/nfsd/nfsd.h really belongs there, so later we may remove that file entirely. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
b9081d90 |
|
09-Jun-2009 |
Yu Zhiguo <yuzg@cn.fujitsu.com> |
NFS: kill off complicated macro 'PROC' kill off obscure macro 'PROC' of NFSv2&3 in order to make the code more clear. Among other things, this makes it simpler to grep for callers of these functions--something which has frequently caused confusion among nfs developers. Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
31dec253 |
|
05-Mar-2009 |
David Shaw <dshaw@jabberwocky.com> |
Short write in nfsd becomes a full write to the client If a filesystem being written to via NFS returns a short write count (as opposed to an error) to nfsd, nfsd treats that as a success for the entire write, rather than the short count that actually succeeded. For example, given a 8192 byte write, if the underlying filesystem only writes 4096 bytes, nfsd will ack back to the nfs client that all 8192 bytes were written. The nfs client does have retry logic for short writes, but this is never called as the client is told the complete write succeeded. There are probably other ways it could happen, but in my case it happened with a fuse (filesystem in userspace) filesystem which can rather easily have a partial write. Here is a patch to properly return the short write count to the client. Signed-off-by: David Shaw <dshaw@jabberwocky.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
12214cb7 |
|
11-Jan-2009 |
Qinghuang Feng <qhfeng.kernel@gmail.com> |
NFSD: cleanup for nfs3proc.c MSDOS_SUPER_MAGIC is defined in <linux/magic.h>, so use MSDOS_SUPER_MAGIC directly. Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
04716e66 |
|
07-Aug-2008 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd: permit unauthenticated stat of export root RFC 2623 section 2.3.2 permits the server to bypass gss authentication checks for certain operations that a client may perform when mounting. In the case of a client that doesn't have some form of credentials available to it on boot, this allows it to perform the mount unattended. (Presumably real file access won't be needed until a user with credentials logs in.) Being slightly more lenient allows lots of old clients to access krb5-only exports, with the only loss being a small amount of information leaked about the root directory of the export. This affects only v2 and v3; v4 still requires authentication for all access. Thanks to Peter Staubach testing against a Solaris client, which suggesting addition of v3 getattr, to the list, and to Trond for noting that doing so exposes no additional information. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: Peter Staubach <staubach@redhat.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
|
#
8837abca |
|
16-Jun-2008 |
Miklos Szeredi <mszeredi@suse.cz> |
nfsd: rename MAY_ flags Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it clear, that these are not used outside nfsd, and to avoid name and number space conflicts with the VFS. [comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
54775491 |
|
14-Feb-2008 |
Jan Blunck <jblunck@suse.de> |
Use struct path in struct svc_export I'm embedding struct path into struct svc_export. [akpm@linux-foundation.org: coding-style fixes] [ezk@cs.sunysb.edu: NFSD: fix wrong mnt_writer count in rename] Signed-off-by: Jan Blunck <jblunck@suse.de> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
cd123012 |
|
09-May-2007 |
Jeff Layton <jlayton@kernel.org> |
RPC: add wrapper for svc_reserve to account for checksum When the kernel calls svc_reserve to downsize the expected size of an RPC reply, it fails to account for the possibility of a checksum at the end of the packet. If a client mounts a NFSv2/3 with sec=krb5i/p, and does I/O then you'll generally see messages similar to this in the server's ring buffer: RPC request reserved 164 but used 208 While I was never able to verify it, I suspect that this problem is also the root cause of some oopses I've seen under these conditions: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726 This is probably also a problem for other sec= types and for NFSv4. The large reserved size for NFSv4 compound packets seems to generally paper over the problem, however. This patch adds a wrapper for svc_reserve that accounts for the possibility of a checksum. It also fixes up the appropriate callers of svc_reserve to call the wrapper. For now, it just uses a hardcoded value that I determined via testing. That value may need to be revised upward as things change, or we may want to eventually add a new auth_op that attempts to calculate this somehow. Unfortunately, there doesn't seem to be a good way to reliably determine the expected checksum length prior to actually calculating it, particularly with schemes like spkm3. Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
81ac95c5 |
|
08-Nov-2006 |
J. Bruce Fields <bfields@fieldses.org> |
[PATCH] nfsd4: fix open-create permissions In the case where an open creates the file, we shouldn't be rechecking permissions to open the file; the open succeeds regardless of what the new file's mode bits say. This patch fixes the problem, but only by introducing yet another parameter to nfsd_create_v3. This is ugly. This will be fixed by later patches. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Acked-by: Neil Brown <neilb@suse.de> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
c4d987ba |
|
20-Oct-2006 |
Al Viro <viro@ftp.linux.org.uk> |
[PATCH] nfsd: NFSv{2,3} trivial endianness annotations for error values Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
7111c66e |
|
20-Oct-2006 |
Al Viro <viro@ftp.linux.org.uk> |
[PATCH] fix svc_procfunc declaration svc_procfunc instances return __be32, not int Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
7adae489 |
|
04-Oct-2006 |
Greg Banks <gnb@melbourne.sgi.com> |
[PATCH] knfsd: Prepare knfsd for support of rsize/wsize of up to 1MB, over TCP The limit over UDP remains at 32K. Also, make some of the apparently arbitrary sizing constants clearer. The biggest change here involves replacing NFSSVC_MAXBLKSIZE by a function of the rqstp. This allows it to be different for different protocols (udp/tcp) and also allows it to depend on the servers declared sv_bufsiz. Note that we don't actually increase sv_bufsz for nfs yet. That comes next. Signed-off-by: Greg Banks <gnb@melbourne.sgi.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
3cc03b16 |
|
04-Oct-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] knfsd: Avoid excess stack usage in svc_tcp_recvfrom .. by allocating the array of 'kvec' in 'struct svc_rqst'. As we plan to increase RPCSVC_MAXPAGES from 8 upto 256, we can no longer allocate an array of this size on the stack. So we allocate it in 'struct svc_rqst'. However svc_rqst contains (indirectly) an array of the same type and size (actually several, but they are in a union). So rather than waste space, we move those arrays out of the separately allocated union and into svc_rqst to share with the kvec moved out of svc_tcp_recvfrom (various arrays are used at different times, so there is no conflict). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
7775f4c8 |
|
10-Apr-2006 |
NeilBrown <neilb@suse.de> |
[PATCH] knfsd: Correct reserved reply space for read requests. NFSd makes sure there is enough space to hold the maximum possible reply before accepting a request. The units for this maximum is (4byte) words. However in three places, particularly for read request, the number given is a number of bytes. This means too much space is reserved which is slightly wasteful. This is the sort of patch that could uncover a deeper bug, and it is not critical, so it would be best for it to spend a while in -mm before going in to mainline. (akpm: target 2.6.17-rc2, 2.6.16.3 (approx)) Discovered-by: "Eivind Sarto" <ivan@kasenna.com> Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
a334de28 |
|
06-Jan-2006 |
David Shaw <dshaw@jabberwocky.com> |
[PATCH] knfsd: check error status from vfs_getattr and i_op->fsync Both vfs_getattr and i_op->fsync return error statuses which nfsd was largely ignoring. This as noticed when exporting directories using fuse. This patch cleans up most of the offences, which involves moving the call to vfs_getattr out of the xdr encoding routines (where it is too late to report an error) into the main NFS procedure handling routines. There is still a called to vfs_gettattr (related to the ACL code) where the status is ignored, and called to nfsd_sync_dir don't check return status either. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
1da177e4 |
|
16-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|