#
1548036e |
|
15-Feb-2024 |
Josef Bacik <josef@toxicpanda.com> |
nfs: make the rpc_stat per net namespace Now that we're exposing the rpc stats on a per-network namespace basis, move this struct into struct nfs_net and use that to make sure only the per-network namespace stats are exposed. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
1fd5394e |
|
17-Nov-2023 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: drop unused nfs_direct_req bytes_left Now that we're calculating how large a remaining IO should be based on the current request's offset, we no longer need to track bytes_left on each struct nfs_direct_req. Drop the field, and clean up the direct request tracepoints. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
8a6291bf |
|
17-Nov-2023 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
pNFS: Fix the pnfs block driver's calculation of layoutget size Instead of relying on the value of the 'bytes_left' field, we should calculate the layout size based on the offset of the request that is being written out. Reported-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Fixes: 954998b60caa ("NFS: Fix error handling for O_DIRECT write scheduling") Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
537935f7 |
|
19-Aug-2023 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS/pNFS: Set the connect timeout for the pNFS flexfiles driver Ensure that the connect timeout for the pNFS flexfiles driver is of the same order as the I/O timeout, so that we can fail over quickly when trying to read from a data server that is down. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
303a7805 |
|
09-Jun-2023 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFSv4.2: Rework scratch handling for READ_PLUS (again) I found that the read code might send multiple requests using the same nfs_pgio_header, but nfs4_proc_read_setup() is only called once. This is how we ended up occasionally double-freeing the scratch buffer, but also means we set a NULL pointer but non-zero length to the xdr scratch buffer. This results in an oops the first time decoding needs to copy something to scratch, which frequently happens when decoding READ_PLUS hole segments. I fix this by moving scratch handling into the pageio read code. I provide a function to allocate scratch space for decoding read replies, and free the scratch buffer when the nfs_pgio_header is freed. Fixes: fbd2a05f29a9 (NFSv4.2: Rework scratch handling for READ_PLUS) Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
c8407f2e |
|
07-Jun-2023 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Add an "xprtsec=" NFS mount option After some discussion, we decided that controlling transport layer security policy should be separate from the setting for the user authentication flavor. To accomplish this, add a new NFS mount option to select a transport layer security policy for RPC operations associated with the mount point. xprtsec=none - Transport layer security is forced off. xprtsec=tls - Establish an encryption-only TLS session. If the initial handshake fails, the mount fails. If TLS is not available on a reconnect, drop the connection and try again. xprtsec=mtls - Both sides authenticate and an encrypted session is created. If the initial handshake fails, the mount fails. If TLS is not available on a reconnect, drop the connection and try again. To support client peer authentication (mtls), the handshake daemon will have configurable default authentication material (certificate or pre-shared key). In the future, mount options can be added that can provide this material on a per-mount basis. Updates to mount.nfs (to support xprtsec=auto) and nfs(5) will be sent under separate cover. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
6c0a8c5f |
|
07-Jun-2023 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Have struct nfs_client carry a TLS policy field The new field is used to match struct nfs_clients that have the same TLS policy setting. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
a7db5034 |
|
22-May-2023 |
David Howells <dhowells@redhat.com> |
nfs: Provide a splice-read wrapper Provide a splice_read wrapper for NFS. This locks the inode around filemap_splice_read() and revalidates the mapping. Splicing from direct I/O is handled by the caller. Signed-off-by: David Howells <dhowells@redhat.com> cc: Christoph Hellwig <hch@lst.de> cc: Al Viro <viro@zeniv.linux.org.uk> cc: Jens Axboe <axboe@kernel.dk> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: linux-nfs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-block@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20230522135018.2742245-21-dhowells@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
#
e59fb674 |
|
03-Mar-2023 |
Jeff Layton <jlayton@kernel.org> |
nfs: move nfs_fhandle_hash to common include file lockd needs to be able to hash filehandles for tracepoints. Move the nfs_fhandle_hash() helper to a common nfs include file. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
000dbe0b |
|
20-Feb-2023 |
Dave Wysochanski <dwysocha@redhat.com> |
NFS: Convert buffered read paths to use netfs when fscache is enabled Convert the NFS buffered read code paths to corresponding netfs APIs, but only when fscache is configured and enabled. The netfs API defines struct netfs_request_ops which must be filled in by the network filesystem. For NFS, we only need to define 5 of the functions, the main one being the issue_read() function. The issue_read() function is called by the netfs layer when a read cannot be fulfilled locally, and must be sent to the server (either the cache is not active, or it is active but the data is not available). Once the read from the server is complete, netfs requires a call to netfs_subreq_terminated() which conveys either how many bytes were read successfully, or an error. Note that issue_read() is called with a structure, netfs_io_subrequest, which defines the IO requested, and contains a start and a length (both in bytes), and assumes the underlying netfs will return a either an error on the whole region, or the number of bytes successfully read. The NFS IO path is page based and the main APIs are the pgio APIs defined in pagelist.c. For the pgio APIs, there is no way for the caller to know how many RPCs will be sent and how the pages will be broken up into underlying RPCs, each of which will have their own completion and return code. In contrast, netfs is subrequest based, a single subrequest may contain multiple pages, and a single subrequest is initiated with issue_read() and terminated with netfs_subreq_terminated(). Thus, to utilze the netfs APIs, NFS needs some way to accommodate the netfs API requirement on the single response to the whole subrequest, while also minimizing disruptive changes to the NFS pgio layer. The approach taken with this patch is to allocate a small structure for each nfs_netfs_issue_read() call, store the final error and number of bytes successfully transferred in the structure, and update these values as each RPC completes. The refcount on the structure is used as a marker for the last RPC completion, is incremented in nfs_netfs_read_initiate(), and decremented inside nfs_netfs_read_completion(), when a nfs_pgio_header contains a valid pointer to the data. On the final put (which signals the final outstanding RPC is complete) in nfs_netfs_read_completion(), call netfs_subreq_terminated() with either the final error value (if one or more READs complete with an error) or the number of bytes successfully transferred (if all RPCs complete successfully). Note that when all RPCs complete successfully, the number of bytes transferred is capped to the length of the subrequest. Capping the transferred length to the subrequest length prevents "Subreq overread" warnings from netfs. This is due to the "aligned_len" in nfs_pageio_add_page(), and the corner case where NFS requests a full page at the end of the file, even when i_size reflects only a partial page (NFS overread). Signed-off-by: Dave Wysochanski <dwysocha@redhat.com> Tested-by: Daire Byrne <daire@dneg.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
0c493b5c |
|
19-Jan-2023 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Convert buffered writes to use folios Mostly mechanical conversion of struct page and functions into struct folio equivalents. The lack of support for folios in write_cache_pages(), means we still only support order 0 folio allocations. However the rest of the writeback code should now be ready for order n > 0. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
ab75bff1 |
|
19-Jan-2023 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Convert buffered reads to use folios Perform a largely mechanical conversion of references to struct page and page-specific functions to use the folio equivalents. Note that the fscache functionality remains untouched. Instead we just pass in the folio page. This should be OK, as long as we use order 0 folios together with fscache. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
eb9f2a5a |
|
19-Jan-2023 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Support folios in nfs_generic_pgio() Add support for multi-page folios in the generic NFS i/o engine. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
e18275ae |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->rename() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
5ebb29be |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->mknod() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
c54bd91e |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->mkdir() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
7a77db95 |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->symlink() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
6c960e68 |
|
12-Jan-2023 |
Christian Brauner <brauner@kernel.org> |
fs: port ->create() to pass mnt_idmap Convert to struct mnt_idmap. Last cycle we merged the necessary infrastructure in 256c8aed2b42 ("fs: introduce dedicated idmap type for mounts"). This is just the conversion to struct mnt_idmap. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate namespaces that are relevant on the filesystem with namespaces that are relevent on the mount level. Especially for non-vfs developers without detailed knowledge in this area this can be a potential source for bugs. Once the conversion to struct mnt_idmap is done all helpers down to the really low-level helpers will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two eliminating the possibility of any bugs. All of the vfs and all filesystems only operate on struct mnt_idmap. Acked-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
#
a60214c2 |
|
30-Nov-2022 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Allow very small rsize & wsize again 940261a19508 introduced nfs_io_size() to clamp the iosize to a multiple of PAGE_SIZE. This had the unintended side effect of no longer allowing iosizes less than a page, which could be useful in some situations. UDP already has an exception that causes it to fall back on the power-of-two style sizes instead. This patch adds an additional exception for very small iosizes. Reported-by: Jeff Layton <jlayton@kernel.org> Fixes: 940261a19508 ("NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
cf0d7e7f |
|
16-Oct-2022 |
Kees Cook <keescook@chromium.org> |
NFS: Avoid memcpy() run-time warning for struct sockaddr overflows The 'nfs_server' and 'mount_server' structures include a union of 'struct sockaddr' (with the older 16 bytes max address size) and 'struct sockaddr_storage' which is large enough to hold all the supported sa_family types (128 bytes max size). The runtime memcpy() buffer overflow checker is seeing attempts to write beyond the 16 bytes as an overflow, but the actual expected size is that of 'struct sockaddr_storage'. Plumb the use of 'struct sockaddr_storage' more completely through-out NFS, which results in adjusting the memcpy() buffers to the correct union members. Avoids this false positive run-time warning under CONFIG_FORTIFY_SOURCE: memcpy: detected field-spanning write (size 28) of single field "&ctx->nfs_server.address" at fs/nfs/namespace.c:178 (size 16) Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna@kernel.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
a035618c |
|
09-Sep-2022 |
Gaosheng Cui <cuigaosheng1@huawei.com> |
nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration nfs_write_prepare() has been removed since commit a4cdda59111f ("NFS: Create a common pgio_rpc_prepare function"), so remove it. nfs_wait_atomic_killable() has been removed since commit 723c921e7dfc ("sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d7a51186 |
|
07-Sep-2022 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFSv4.2: Update mode bits after ALLOCATE and DEALLOCATE The fallocate call invalidates suid and sgid bits as part of normal operation. We need to mark the mode bits as invalid when using fallocate with an suid so these will be updated the next time the user looks at them. This fixes xfstests generic/683 and generic/684. Reported-by: Yue Cui <cuiyue-fnst@fujitsu.com> Fixes: 913eca1aea87 ("NFS: Fallocate should use the nfs4_fattr_bitmap") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
8efc4bbe |
|
22-Jul-2022 |
Jeff Layton <jlayton@kernel.org> |
nfs: add new nfs_direct_req tracepoint events Add some new tracepoints to the DIO write code. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
940261a1 |
|
17-Jun-2022 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Allow setting rsize / wsize to a multiple of PAGE_SIZE Previously, we required this to value to be a power of 2 for UDP related reasons. This patch keeps the power of 2 rule for UDP but allows more flexibility for TCP and RDMA. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
4ae84a80 |
|
06-Jun-2022 |
Matthew Wilcox (Oracle) <willy@infradead.org> |
nfs: Convert to migrate_folio Use a folio throughout this function. migrate_page() will be converted later. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
45228440 |
|
14-May-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Memory allocation failures are not server fatal errors We need to filter out ENOMEM in nfs_error_is_fatal_on_server(), because running out of memory on our client is not a server error. Reported-by: Olga Kornievskaia <aglo@umich.edu> Fixes: 2dc23afffbca ("NFS: ENOMEM should also be a fatal error.") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b243874f |
|
29-Mar-2022 |
ChenXiaoSong <chenxiaosong2@huawei.com> |
NFSv4: fix open failure with O_ACCMODE flag open() with O_ACCMODE|O_DIRECT flags secondly will fail. Reproducer: 1. mount -t nfs -o vers=4.2 $server_ip:/ /mnt/ 2. fd = open("/mnt/file", O_ACCMODE|O_DIRECT|O_CREAT) 3. close(fd) 4. fd = open("/mnt/file", O_ACCMODE|O_DIRECT) Server nfsd4_decode_share_access() will fail with error nfserr_bad_xdr when client use incorrect share access mode of 0. Fix this by using NFS4_SHARE_ACCESS_BOTH share access mode in client, just like firstly opening. Fixes: ce4ef7c0a8a05 ("NFS: Split out NFS v4 file operations") Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
515dcdcd |
|
20-Mar-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: nfsiod should not block forever in mempool_alloc() The concern is that since nfsiod is sometimes required to kick off a commit, it can get locked up waiting forever in mempool_alloc() instead of failing gracefully and leaving the commit until later. Try to allocate from the slab first, with GFP_KERNEL | __GFP_NORETRY, then fall back to a non-blocking attempt to allocate from the memory pool. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
230bc98f |
|
17-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Improve heuristic for readdirplus The heuristic for readdirplus is designed to try to detect 'ls -l' and similar patterns. It does so by looking for cache hit/miss patterns in both the attribute cache and in the dcache of the files in a given directory, and then sets a flag for the readdirplus code to interpret. The problem with this approach is that a single attribute or dcache miss can cause the NFS code to force a refresh of the attributes for the entire set of files contained in the directory. To be able to make a more nuanced decision, let's sample the number of hits and misses in the set of open directory descriptors. That allows us to set thresholds at which we start preferring READDIRPLUS over regular READDIR, or at which we start to force a re-read of the remaining readdir cache using READDIRPLUS. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
84631f84 |
|
23-Feb-2022 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Clean up NFSv4.2 xattrs Add a helper for the xattr mask so that we can get rid of the inlined ifdefs. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
00bdadc7 |
|
17-Dec-2021 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Add a helper to remove case-insensitive aliases When dealing with case insensitive names, the client has no idea how the server performs the mapping, so cannot collapse the dentries into a single representative. So both rename and unlink need to deal with the fact that there could be several dentries representing the file, and have to somehow force them to be revalidated. Use d_prune_aliases() as a big hammer approach. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d755ad8d |
|
22-Oct-2021 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Create a new nfs_alloc_fattr_with_label() function For creating fattrs with the label field already allocated for us. I also update nfs_free_fattr() to free the label in the end. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
5fe1210d |
|
14-Oct-2021 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Unexport nfs_probe_fsinfo() All the callers are now in client.c so we can remove the EXPORT_SYMBOL_GPL() and make it static. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
e5731131 |
|
14-Oct-2021 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Move nfs_probe_destination() into the generic client And rename it to nfs_probe_server(). I also change it to take the nfs_fh as an argument so callers can choose what filehandle to probe. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
01dde76e |
|
14-Oct-2021 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Create an nfs4_server_set_init_caps() function And call it before doing an FSINFO probe to reset to the baseline capabilities before probing. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
7e134205 |
|
27-Aug-2021 |
Olga Kornievskaia <kolga@netapp.com> |
NFSv4 introduce max_connect mount options This option will control up to how many xprts can the client establish to the server with a distinct address (that means nconnect connections are not counted towards this new limit). This patch is setting up nfs structures to keeep track of the max_connect limit (does not enforce it). The default value is kept at 1 so that no current mounts that don't want any additional connections would be effected. The maximum value is set at 16. Mounts to DS are not limited to default value of 1 but instead set to the maximum default value of 16 (NFS_MAX_TRANSPORTS). Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
c9301cb3 |
|
22-Mar-2021 |
Eryu Guan <eguan@linux.alibaba.com> |
nfs: hornor timeo and retrans option when mounting NFSv3 Mounting NFSv3 uses default timeout parameters specified by underlying sunrpc transport, and mount options like 'timeo' and 'retrans', unlike NFSv4, are not honored. But sometimes we want to set non-default timeout value when mounting NFSv3, so pass 'timeo' and 'retrans' to nfs_mount() and fill the 'timeout' field of struct rpc_create_args before creating RPC connection. This is also consistent with NFSv4 behavior. Note that this only sets the timeout value of rpc connection to mountd, but the timeout of rpcbind connection should be set as well. A later patch will fix the rpcbind part. Signed-off-by: Eryu Guan <eguan@linux.alibaba.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
ec1ade6a |
|
19-Feb-2021 |
Olga Kornievskaia <kolga@netapp.com> |
nfs: account for selinux security context when deciding to share superblock Keep track of whether or not there were LSM security context options passed during mount (ie creation of the superblock). Then, while deciding if the superblock can be shared for the new mount, check if the newly passed in LSM security context options are compatible with the existing superblock's ones by calling security_sb_mnt_opts_compat(). Previously, with selinux enabled, NFS wasn't able to do the following 2mounts: mount -o vers=4.2,sec=sys,context=system_u:object_r:root_t:s0 <serverip>:/ /mnt mount -o vers=4.2,sec=sys,context=system_u:object_r:swapfile_t:s0 <serverip>:/scratch /scratch 2nd mount would fail with "mount.nfs: an incorrect mount option was specified" and var log messages would have: "SElinux: mount invalid. Same superblock, different security settings for.." Signed-off-by: Olga Kornievskaia <kolga@netapp.com> [PM: tweak subject line] Signed-off-by: Paul Moore <paul@paul-moore.com>
|
#
fd6d3fee |
|
08-Mar-2021 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Clean up function nfs_mark_dir_for_revalidate() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
549c7297 |
|
21-Jan-2021 |
Christian Brauner <christian.brauner@ubuntu.com> |
fs: make helpers idmap mount aware Extend some inode methods with an additional user namespace argument. A filesystem that is aware of idmapped mounts will receive the user namespace the mount has been marked with. This can be used for additional permission checking and also to enable filesystems to translate between uids and gids if they need to. We have implemented all relevant helpers in earlier patches. As requested we simply extend the exisiting inode method instead of introducing new ones. This is a little more code churn but it's mostly mechanical and doesnt't leave us with additional inode methods. Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com Cc: Christoph Hellwig <hch@lst.de> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
#
896567ee |
|
10-Jan-2021 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: nfs_igrab_and_active must first reference the superblock Before referencing the inode, we must ensure that the superblock can be referenced. Otherwise, we can end up with iput() calling superblock operations that are no longer valid or accessible. Fixes: ea7c38fef0b7 ("NFSv4: Ensure we reference the inode for return-on-close in delegreturn") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
c98e9daa |
|
05-Jan-2021 |
Scott Mayhew <smayhew@redhat.com> |
NFS: Adjust fs_context error logging Several existing dprink()/dfprintk() calls were converted to use the new mount API logging macros by commit ce8866f0913f ("NFS: Attach supplementary error information to fs_context"). If the fs_context was not created using fsopen() then it will not have had a log buffer allocated for it, and the new mount API logging macros will wind up calling printk(). This can result in syslog messages being logged where previously there were none... most notably "NFS4: Couldn't follow remote path", which can happen if the client is auto-negotiating a protocol version with an NFS server that doesn't support the higher v4.x versions. Convert the nfs_errorf(), nfs_invalf(), and nfs_warnf() macros to check for the existence of the fs_context's log buffer and call dprintk() if it doesn't exist. Add nfs_ferrorf(), nfs_finvalf(), and nfs_warnf(), which do the same thing but take an NFS debug flag as an argument and call dfprintk(). Finally, modify the "NFS4: Couldn't follow remote path" message to use nfs_ferrorf(). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207385 Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Fixes: ce8866f0913f ("NFS: Attach supplementary error information to fs_context.") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
1a34c8c9 |
|
01-Nov-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Support larger readdir buffers Support readdir buffers of up to 1MB in size so that we can read large directories using few RPC calls. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Dave Wysochanski <dwysocha@redhat.com>
|
#
8d92890b |
|
01-Jun-2020 |
NeilBrown <neilb@suse.de> |
mm/writeback: discard NR_UNSTABLE_NFS, use NR_WRITEBACK instead After an NFS page has been written it is considered "unstable" until a COMMIT request succeeds. If the COMMIT fails, the page will be re-written. These "unstable" pages are currently accounted as "reclaimable", either in WB_RECLAIMABLE, or in NR_UNSTABLE_NFS which is included in a 'reclaimable' count. This might have made sense when sending the COMMIT required a separate action by the VFS/MM (e.g. releasepage() used to send a COMMIT). However now that all writes generated by ->writepages() will automatically be followed by a COMMIT (since commit 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete")) it makes more sense to treat them as writeback pages. So this patch removes NR_UNSTABLE_NFS and accounts unstable pages in NR_WRITEBACK and WB_WRITEBACK. A particular effect of this change is that when wb_check_background_flush() calls wb_over_bg_threshold(), the latter will report 'true' a lot less often as the 'unstable' pages are no longer considered 'dirty' (as there is nothing that writeback can do about them anyway). Currently wb_check_background_flush() will trigger writeback to NFS even when there are relatively few dirty pages (if there are lots of unstable pages), this can result in small writes going to the server (10s of Kilobytes rather than a Megabyte) which hurts throughput. With this patch, there are fewer writes which are each larger on average. Where the NR_UNSTABLE_NFS count was included in statistics virtual-files, the entry is retained, but the value is hard-coded as zero. static trace points and warning printks which mentioned this counter no longer report it. [akpm@linux-foundation.org: re-layout comment] [akpm@linux-foundation.org: fix printk warning] Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@hammerspace.com> Acked-by: Michal Hocko <mhocko@suse.com> [mm] Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Link: http://lkml.kernel.org/r/87d06j7gqa.fsf@notabene.neil.brown.name Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
377840ee |
|
29-Mar-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Remove the redundant function nfs_pgio_has_mirroring() We need to trust that desc->pg_mirror_idx is set correctly, whether or not mirroring is enabled. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
0aa647b7 |
|
21-Mar-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Remove bucket array from struct pnfs_ds_commit_info Remove the unused bucket array in struct pnfs_ds_commit_info. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
a9901899 |
|
20-Mar-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
pNFS: Add infrastructure for cleaning up per-layout commit structures Ensure that both the file and flexfiles layout types clean up when freeing the layout segments. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
e3b9f7e6 |
|
20-Mar-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS/pNFS: Support commit arrays in nfs_clear_pnfs_ds_commit_verifiers() Add support for scanning the full list of per-layout segment commit arrays to nfs_clear_pnfs_ds_commit_verifiers(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
1f28476d |
|
21-Mar-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Fix O_DIRECT commit verifier handling Instead of trying to save the commit verifiers and checking them against previous writes, adopt the same strategy as for buffered writes, of just checking the verifiers at commit time. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
3c9e502b |
|
26-Feb-2020 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Add a helper nfs_client_for_each_server() Add a helper nfs_client_for_each_server() to iterate through all the filesystems that are attached to a struct nfs_client, and apply a function to all the active ones. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
f7b37b8b |
|
13-Jan-2020 |
Trond Myklebust <trondmy@gmail.com> |
NFS: Add softreval behaviour to nfs_lookup_revalidate() If the server is unavaliable, we want to allow the revalidating lookup to time out, and to default to validating the cached dentry if the 'softreval' mount option is set. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
ae08483c |
|
11-Nov-2019 |
Arnd Bergmann <arnd@arndb.de> |
nfs: use timespec64 in nfs_fattr Push down the use of timespec64 into NFS nfs_fattr, to avoid needless conversions, and get closer to having 64-bit time_t support on 32-bit NFSv4 and removing some old interfaces from the kernel. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
ce8866f0 |
|
10-Dec-2019 |
Scott Mayhew <smayhew@redhat.com> |
NFS: Attach supplementary error information to fs_context. Split out from commit "NFS: Add fs_context support." Add wrappers nfs_errorf(), nfs_invalf(), and nfs_warnf() which log error information to the fs_context. Convert some printk's to use these new wrappers instead. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
62a55d08 |
|
10-Dec-2019 |
Scott Mayhew <smayhew@redhat.com> |
NFS: Additional refactoring for fs_context conversion Split out from commit "NFS: Add fs_context support." This patch adds additional refactoring for the conversion of NFS to use fs_context, namely: (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context. nfs_clone_mount has had several fields removed, and nfs_mount_info has been removed altogether. (*) Various functions now take an fs_context as an argument instead of nfs_mount_info, nfs_fs_context, etc. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
f2aedb71 |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Add fs_context support. Add filesystem context support to NFS, parsing the options in advance and attaching the information to struct nfs_fs_context. The highlights are: (*) Merge nfs_mount_info and nfs_clone_mount into nfs_fs_context. This structure represents NFS's superblock config. (*) Make use of the VFS's parsing support to split comma-separated lists (*) Pin the NFS protocol module in the nfs_fs_context. (*) Attach supplementary error information to fs_context. This has the downside that these strings must be static and can't be formatted. (*) Remove the auxiliary file_system_type structs since the information necessary can be conveyed in the nfs_fs_context struct instead. (*) Root mounts are made by duplicating the config for the requested mount so as to have the same parameters. Submounts pick up their parameters from the parent superblock. [AV -- retrans is u32, not string] [SM -- Renamed cfg to ctx in a few functions in an earlier patch] [SM -- Moved fs_context mount option parsing to an earlier patch] [SM -- Moved fs_context error logging to a later patch] [SM -- Fixed printks in nfs4_try_get_tree() and nfs4_get_referral_tree()] [SM -- Added is_remount_fc() helper] [SM -- Deferred some refactoring to a later patch] [SM -- Fixed referral mounts, which were broken in the original patch] [SM -- Fixed leak of nfs_fattr when fs_context is freed] Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
e558100f |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Do some tidying of the parsing code Do some tidying of the parsing code, including: (*) Returning 0/error rather than true/false. (*) Putting the nfs_fs_context pointer first in some arg lists. (*) Unwrap some lines that will now fit on one line. (*) Provide unioned sockaddr/sockaddr_storage fields to avoid casts. (*) nfs_parse_devname() can paste its return values directly into the nfs_fs_context struct as that's where the caller puts them. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
48be8a66 |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Add a small buffer in nfs_fs_context to avoid string dup Add a small buffer in nfs_fs_context to avoid string duplication when parsing numbers. Also make the parsing function wrapper place the parsed integer directly in the appropriate nfs_fs_context struct member. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
f8ee01e3 |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Split nfs_parse_mount_options() Split nfs_parse_mount_options() to move the prologue, list-splitting and epilogue into one function and the per-option processing into another. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
5eb005ca |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Rename struct nfs_parsed_mount_data to struct nfs_fs_context Rename struct nfs_parsed_mount_data to struct nfs_fs_context and rename pointers to it to "ctx". At some point this will be pointed to by an fs_context struct's fs_private pointer. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
9954bf92 |
|
10-Dec-2019 |
David Howells <dhowells@redhat.com> |
NFS: Move mount parameterisation bits into their own file Split various bits relating to mount parameterisation out from fs/nfs/super.c into their own file to form the basis of filesystem context handling for NFS. No other changes are made to the code beyond removing 'static' qualifiers. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
adf2314f |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: get rid of ->set_security() it's always either nfs_set_sb_security() or nfs_clone_sb_security(), the choice being controlled by mount_info->cloned != NULL. No need to add methods, especially when both instances live right next to the caller and are never accessed anywhere else. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
ab88dca3 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: get rid of mount_info ->fill_super() The only possible values are nfs_fill_super and nfs_clone_super. The latter is used only when crossing into a submount and it is almost identical to the former; the only differences are * ->s_time_gran unconditionally set to 1 (even for v2 mounts). Regression dating back to 2012, actually. * ->s_blocksize/->s_blocksize_bits set to that of parent. Rather than messing with the method, stash ->s_blocksize_bits in mount_info in submount case and after the (now unconditional) call of nfs_fill_super() override ->s_blocksize/->s_blocksize_bits if that has been set. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
0c38f213 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: don't pass nfs_subversion to ->create_server() pick it from mount_info Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
1bc3a2cb |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: unexport nfs_fs_mount_common() Make it static, even. And remove a stale extern of (long-gone) nfs_xdev_mount_common() from internal.h, while we are at it. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
82eaed2b |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: merge xdev and remote file_system_type they are identical now... Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
a55d3297 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: don't bother passing nfs_subversion to ->try_mount() and nfs_fs_mount_common() Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
6a3f7a39 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: stash nfs_subversion reference into nfs_mount_info That will allow to get rid of passing those references around in quite a few places. Moreover, that will allow to merge xdev and remote file_system_type. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
250d69f6 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: lift setting mount_info from nfs_xdev_mount() Do it in nfs_do_submount() instead. As a side benefit, nfs_clone_data doesn't need ->fh and ->fattr anymore. Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d0b779d4 |
|
10-Dec-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: stash server into struct nfs_mount_info Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
e86d5a02 |
|
04-Oct-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Convert struct nfs_fattr to use struct timespec64 NFSv4 supports 64-bit times, so we should switch to using struct timespec64 when decoding attributes. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
c128e575 |
|
22-Sep-2019 |
Trond Myklebust <trondmy@gmail.com> |
NFS: Optimise the default readahead size In the years since the max readahead size was fixed in NFS, a number of things have happened: - Users can now set the value directly using /sys/class/bdi - NFS max supported block sizes have increased by several orders of magnitude from 64K to 1MB. - Disk access latencies are orders of magnitude faster due to SSD + NVME. In particular note that if the server is advertising 1MB as the optimal read size, as that will set the readahead size to 15MB. Let's therefore adjust down, and try to default to VM_READAHEAD_PAGES. However let's inform the VM about our preferred block size so that it can choose to round up in cases where that makes sense. Reported-by: Alkis Georgopoulos <alkisg@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
8f54c7a4 |
|
14-Aug-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Fix spurious EIO read errors If the client attempts to read a page, but the read fails due to some spurious error (e.g. an ACCESS error or a timeout, ...) then we need to allow other processes to retry. Also try to report errors correctly when doing a synchronous readpage. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
db531db9 |
|
12-Jul-2019 |
Max Kellermann <mk@cm4all.com> |
Revert "NFS: readdirplus optimization by cache mechanism" (memleak) This reverts commit be4c2d4723a4a637f0d1b4f7c66447141a4b3564. That commit caused a severe memory leak in nfs_readdir_make_qstr(). When listing a directory with more than 100 files (this is how many struct nfs_cache_array_entry elements fit in one 4kB page), all allocated file name strings past those 100 leak. The root of the leakage is that those string pointers are managed in pages which are never linked into the page cache. fs/nfs/dir.c puts pages into the page cache by calling read_cache_page(); the callback function nfs_readdir_filler() will then fill the given page struct which was passed to it, which is already linked in the page cache (by do_read_cache_page() calling add_to_page_cache_lru()). Commit be4c2d4723a4 added another (local) array of allocated pages, to be filled with more data, instead of discarding excess items received from the NFS server. Those additional pages can be used by the next nfs_readdir_filler() call (from within the same nfs_readdir() call). The leak happens when some of those additional pages are never used (copied to the page cache using copy_highpage()). The pages will be freed by nfs_readdir_free_pages(), but their contents will not. The commit did not invoke nfs_readdir_clear_array() (and doing so would have been dangerous, because it did not track which of those pages were already copied to the page cache, risking double free bugs). How to reproduce the leak: - Use a kernel with CONFIG_SLUB_DEBUG_ON. - Create a directory on a NFS mount with more than 100 files with names long enough to use the "kmalloc-32" slab (so we can easily look up the allocation counts): for i in `seq 110`; do touch ${i}_0123456789abcdef; done - Drop all caches: echo 3 >/proc/sys/vm/drop_caches - Check the allocation counter: grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564391 nfs_readdir_add_to_array+0x73/0xd0 age=534558/4791307/6540952 pid=370-1048386 cpus=0-47 nodes=0-1 - Request a directory listing and check the allocation counters again: ls [...] grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564511 nfs_readdir_add_to_array+0x73/0xd0 age=207/4792999/6542663 pid=370-1048386 cpus=0-47 nodes=0-1 There are now 120 new allocations. - Drop all caches and check the counters again: echo 3 >/proc/sys/vm/drop_caches grep nfs_readdir /sys/kernel/slab/kmalloc-32/alloc_calls 30564401 nfs_readdir_add_to_array+0x73/0xd0 age=735/4793524/6543176 pid=370-1048386 cpus=0-47 nodes=0-1 110 allocations are gone, but 10 have leaked and will never be freed. Unhelpfully, those allocations are explicitly excluded from KMEMLEAK, that's why my initial attempts with KMEMLEAK were not successful: /* * Avoid a kmemleak false positive. The pointer to the name is stored * in a page cache page which kmemleak does not scan. */ kmemleak_not_leak(string->name); It would be possible to solve this bug without reverting the whole commit: - keep track of which pages were not used, and call nfs_readdir_clear_array() on them, or - manually link those pages into the page cache But for now I have decided to just revert the commit, because the real fix would require complex considerations, risking more dangerous (crash) bugs, which may seem unsuitable for the stable branches. Signed-off-by: Max Kellermann <mk@cm4all.com> Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
6619079d |
|
27-Apr-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4: Allow multiple connections to NFSv4.x (x>0) servers If the user specifies the -onconn=<number> mount option, and the transport protocol is TCP, then set up <number> connections to the server. The connections will all go to the same IP address. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
28cc5cd8 |
|
26-Apr-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Add a mount option to specify number of TCP connections to use Allow the user to specify that the client should use multiple connections to the server. For the moment, this functionality will be limited to TCP and to NFSv4.x (x>0). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
10b7a70c |
|
06-Feb-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: Cleanup - add nfs_clients_exit to mirror nfs_clients_init Add a helper to clean up the struct nfs_net when it is being destroyed. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
ca1a199e |
|
15-Apr-2019 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs{,4}: switch to ->free_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
1a58e8a0 |
|
24-Apr-2019 |
Trond Myklebust <trondmy@gmail.com> |
NFS: Store the credential of the mount process in the nfs_server Store the credential of the mount process so that we can determine information such as the user namespace. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
6fbda89b |
|
07-Apr-2019 |
Trond Myklebust <trondmy@gmail.com> |
NFS: Replace custom error reporting mechanism with generic one Replace the NFS custom error reporting mechanism with the generic mapping_set_error(). Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
11982a7c |
|
07-Apr-2019 |
Trond Myklebust <trondmy@gmail.com> |
NFS: Consider ETIMEDOUT to be a fatal error When we introduce the 'softerr' mount option, we will see the RPC layer returning ETIMEDOUT errors if the server is unresponsive. We want to consider those errors to be fatal on par with the EIO errors that are returned by ordinary 'soft' timeouts.. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
be4c2d47 |
|
29-Jan-2019 |
luanshi <zhangliguang@linux.alibaba.com> |
NFS: readdirplus optimization by cache mechanism When listing very large directories via NFS, clients may take a long time to complete. There are about three factors involved: First of all, ls and practically every other method of listing a directory including python os.listdir and find rely on libc readdir(). However readdir() only reads 32K of directory entries at a time, which means that if you have a lot of files in the same directory, it is going to take an insanely long time to read all the directory entries. Secondly, libc readdir() reads 32K of directory entries at a time, in kernel space 32K buffer split into 8 pages. One NFS readdirplus rpc will be called for one page, which introduces many readdirplus rpc calls. Lastly, one NFS readdirplus rpc asks for 32K data (filled by nfs_dentry) to fill one page (filled by dentry), we found that nearly one third of data was wasted. To solve above problems, pagecache mechanism was introduced. One NFS readdirplus rpc will ask for a large data (more than 32k), the data can fill more than one page, the cached pages can be used for next readdir call. This can reduce many readdirplus rpc calls and improve readdirplus performance. TESTING: When listing very large directories(include 300 thousand files) via NFS time ls -l /nfs_mount | wc -l without the patch: 300001 real 1m53.524s user 0m2.314s sys 0m2.599s with the patch: 300001 real 0m23.487s user 0m2.305s sys 0m2.558s Improved performance: 79.6% readdirplus rpc calls decrease: 85% Signed-off-by: Liguang Zhang <zhangliguang@linux.alibaba.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
2dc23aff |
|
13-Feb-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: ENOMEM should also be a fatal error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
7dc58ca5 |
|
22-Jan-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
NFS: EINTR is also a fatal error. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
10e037d1 |
|
18-Dec-2018 |
Santosh kumar pradhan <santoshkumar.pradhan@wdc.com> |
sunrpc: Add xprt after nfs4_test_session_trunk() Multipathing: In case of NFSv3, rpc_clnt_test_and_add_xprt() adds the xprt to xprt switch (i.e. xps) if rpc_call_null_helper() returns success. But in case of NFSv4.1, it needs to do EXCHANGEID to verify the path along with check for session trunking. Add the xprt in nfs4_test_session_trunk() only when nfs4_detect_session_trunking() returns success. Also release refcount hold by rpc_clnt_setup_test_and_add_xprt(). Signed-off-by: Santosh kumar pradhan <santoshkumar.pradhan@wdc.com> Tested-by: Suresh Jayaraman <suresh.jayaraman@wdc.com> Reported-by: Aditya Agnihotri <aditya.agnihotri@wdc.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
204cc0cc |
|
13-Dec-2018 |
Al Viro <viro@zeniv.linux.org.uk> |
LSM: hide struct security_mnt_opts from any generic code Keep void * instead, allocate on demand (in parse_str_opts, at the moment). Eventually both selinux and smack will be better off with private structures with several strings in those, rather than this "counter and two pointers to dynamically allocated arrays" ugliness. This commit allows to do that at leisure, without disrupting anything outside of given module. Changes: * instead of struct security_mnt_opt use an opaque pointer initialized to NULL. * security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and security_free_mnt_opts() take it as var argument (i.e. as void **); call sites are unchanged. * security_sb_set_mnt_opts() and security_sb_remount() take it by value (i.e. as void *). * new method: ->sb_free_mnt_opts(). Takes void *, does whatever freeing that needs to be done. * ->sb_set_mnt_opts() and ->sb_remount() might get NULL as mnt_opts argument, meaning "empty". Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
a52458b4 |
|
02-Dec-2018 |
NeilBrown <neilb@suse.com> |
NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'. SUNRPC has two sorts of credentials, both of which appear as "struct rpc_cred". There are "generic credentials" which are supplied by clients such as NFS and passed in 'struct rpc_message' to indicate which user should be used to authorize the request, and there are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS which describe the credential to be sent over the wires. This patch replaces all the generic credentials by 'struct cred' pointers - the credential structure used throughout Linux. For machine credentials, there is a special 'struct cred *' pointer which is statically allocated and recognized where needed as having a special meaning. A look-up of a low-level cred will map this to a machine credential. Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
069d5bf5 |
|
26-Oct-2018 |
Olga Kornievskaia <kolga@netapp.com> |
NFSv4: cleanup remove unused nfs4_xdev_fs_type commit e8f25e6d6d19 "NFS: Remove the NFS v4 xdev mount function" removed the last use of this. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
1751e8a6 |
|
27-Nov-2017 |
Linus Torvalds <torvalds@linux-foundation.org> |
Rename superblock flags (MS_xyz -> SB_xyz) This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_*, with the names and the values for the moment mirroring the MS_* flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
5e4def20 |
|
02-Nov-2017 |
David Howells <dhowells@redhat.com> |
Pass mode to wait_on_atomic_t() action funcs and provide default actions Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an extra argument and make it 'unsigned int throughout. Also, consolidate a bunch of identical action functions into a default function that can do the appropriate thing for the mode. Also, change the argument name in the bit_wait*() function declarations to reflect the fact that it's the mode and not the bit number. [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait should be done differently, though he's not immediately sure as to how] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> cc: Ingo Molnar <mingo@kernel.org>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
bf4b4905 |
|
10-Sep-2017 |
NeilBrown <neilb@suse.com> |
NFS: various changes relating to reporting IO errors. 1/ remove 'start' and 'end' args from nfs_file_fsync_commit(). They aren't used. 2/ Make nfs_context_set_write_error() a "static inline" in internal.h so we can... 3/ Use nfs_context_set_write_error() instead of mapping_set_error() if nfs_pageio_add_request() fails before sending any request. NFS generally keeps errors in the open_context, not the mapping, so this is more consistent. 4/ If filemap_write_and_write_range() reports any error, still check ctx->error. The value in ctx->error is likely to be more useful. As part of this, NFS_CONTEXT_ERROR_WRITE is cleared slightly earlier, before nfs_file_fsync_commit() is called, rather than at the start of that function. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
196639eb |
|
08-Sep-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix 2 use after free issues in the I/O code The writeback code wants to send a commit after processing the pages, which is why we want to delay releasing the struct path until after that's done. Also, the layout code expects that we do not free the inode before we've put the layout segments in pnfs_writehdr_free() and pnfs_readhdr_free() Fixes: 919e3bd9a875 ("NFS: Ensure we commit after writeback is complete") Fixes: 4714fb51fd03 ("nfs: remove pgio_header refcount, related cleanup") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
20fa1902 |
|
29-Jun-2017 |
Peng Tao <tao.peng@primarydata.com> |
nfs: add export operations This support for opening files on NFS by file handle, both through the open_by_handle syscall, and for re-exporting NFS (for example using a different version). The support is very basic for now, as each open by handle will have to do an NFSv4 open operation on the wire. In the future this will hopefully be mitigated by an open file cache, as well as various optimizations in NFS for this specific case. Signed-off-by: Peng Tao <tao.peng@primarydata.com> [hch: incorporated various changes, resplit the patches, new changelog] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
00422483 |
|
29-Jun-2017 |
Peng Tao <tao.peng@primarydata.com> |
nfs: add export operations This support for opening files on NFS by file handle, both through the open_by_handle syscall, and for re-exporting NFS (for example using a different version). The support is very basic for now, as each open by handle will have to do an NFSv4 open operation on the wire. In the future this will hopefully be mitigated by an open file cache, as well as various optimizations in NFS for this specific case. Signed-off-by: Peng Tao <tao.peng@primarydata.com> [hch: incorporated various changes, resplit the patches, new changelog] Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
a7a3b1e9 |
|
20-Jun-2017 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: convert flags to bool NFS uses some int, and unsigned int :1, and bool as flags in structs and args. Assert the preference for uniformly replacing these with the bool type. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
aa8217d5 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_version instances as const Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
511e936b |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct rpc_procinfo instances as const struct rpc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
5dd43ce2 |
|
19-Jun-2017 |
Ingo Molnar <mingo@kernel.org> |
sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> The wait_bit*() types and APIs are mixed into wait.h, but they are a pretty orthogonal extension of wait-queues. Furthermore, only about 50 kernel files use these APIs, while over 1000 use the regular wait-queue functionality. So clean up the main wait.h by moving the wait-bit functionality out of it, into a separate .h and .c file: include/linux/wait_bit.h for types and APIs kernel/sched/wait_bit.c for the implementation Update all header dependencies. This reduces the size of wait.h rather significantly, by about 30%. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
4f253e1e |
|
15-May-2017 |
Jan Kara <jack@suse.cz> |
nfs: Mark unnecessarily extern functions as static nfs_initialise_sb() and nfs_clone_super() are declared as extern even though they are used only in fs/nfs/super.c. Mark them as static. Also remove explicit 'inline' directive from nfs_initialise_sb() and leave it upto compiler to decide whether inlining is worth it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
e9679189 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct svc_version instances as const Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
499b4988 |
|
12-May-2017 |
Christoph Hellwig <hch@lst.de> |
sunrpc: mark all struct rpc_procinfo instances as const struct rpc_procinfo contains function pointers, and marking it as constant avoids it being able to be used as an attach vector for code injections. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
e0926934 |
|
08-May-2017 |
Olga Kornievskaia <kolga@netapp.com> |
NFS append COMMIT after synchronous COPY Instead of messing with the commit path which has been causing issues, add a COMMIT op after the COPY and ask for stable copies in the first space. It saves a round trip, since after the COPY, the client sends a COMMIT anyway. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
9052c7cf |
|
04-May-2017 |
Jan Kara <jack@suse.cz> |
nfs: Fix bdi handling for cloned superblocks In commit 0d3b12584972 "nfs: Convert to separately allocated bdi" I have wrongly cloned bdi reference in nfs_clone_super(). Further inspection has shown that originally the code was actually allocating a new bdi (in ->clone_server callback) which was later registered in nfs_fs_mount_common() and used for sb->s_bdi in nfs_initialise_sb(). This could later result in bdi for the original superblock not getting unregistered when that superblock got shutdown (as the cloned sb still held bdi reference) and later when a new superblock was created under the same anonymous device number, a clash in sysfs has happened on bdi registration: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 10284 at /linux-next/fs/sysfs/dir.c:31 sysfs_warn_dup+0x64/0x74 sysfs: cannot create duplicate filename '/devices/virtual/bdi/0:32' Modules linked in: axp20x_usb_power gpio_axp209 nvmem_sunxi_sid sun4i_dma sun4i_ss virt_dma CPU: 1 PID: 10284 Comm: mount.nfs Not tainted 4.11.0-rc4+ #14 Hardware name: Allwinner sun7i (A20) Family [<c010f19c>] (unwind_backtrace) from [<c010bc74>] (show_stack+0x10/0x14) [<c010bc74>] (show_stack) from [<c03c6e24>] (dump_stack+0x78/0x8c) [<c03c6e24>] (dump_stack) from [<c0122200>] (__warn+0xe8/0x100) [<c0122200>] (__warn) from [<c0122250>] (warn_slowpath_fmt+0x38/0x48) [<c0122250>] (warn_slowpath_fmt) from [<c02ac178>] (sysfs_warn_dup+0x64/0x74) [<c02ac178>] (sysfs_warn_dup) from [<c02ac254>] (sysfs_create_dir_ns+0x84/0x94) [<c02ac254>] (sysfs_create_dir_ns) from [<c03c8b8c>] (kobject_add_internal+0x9c/0x2ec) [<c03c8b8c>] (kobject_add_internal) from [<c03c8e24>] (kobject_add+0x48/0x98) [<c03c8e24>] (kobject_add) from [<c048d75c>] (device_add+0xe4/0x5a0) [<c048d75c>] (device_add) from [<c048ddb4>] (device_create_groups_vargs+0xac/0xbc) [<c048ddb4>] (device_create_groups_vargs) from [<c048dde4>] (device_create_vargs+0x20/0x28) [<c048dde4>] (device_create_vargs) from [<c02075c8>] (bdi_register_va+0x44/0xfc) [<c02075c8>] (bdi_register_va) from [<c023d378>] (super_setup_bdi_name+0x48/0xa4) [<c023d378>] (super_setup_bdi_name) from [<c0312ef4>] (nfs_fill_super+0x1a4/0x204) [<c0312ef4>] (nfs_fill_super) from [<c03133f0>] (nfs_fs_mount_common+0x140/0x1e8) [<c03133f0>] (nfs_fs_mount_common) from [<c03335cc>] (nfs4_remote_mount+0x50/0x58) [<c03335cc>] (nfs4_remote_mount) from [<c023ef98>] (mount_fs+0x14/0xa4) [<c023ef98>] (mount_fs) from [<c025cba0>] (vfs_kern_mount+0x54/0x128) [<c025cba0>] (vfs_kern_mount) from [<c033352c>] (nfs_do_root_mount+0x80/0xa0) [<c033352c>] (nfs_do_root_mount) from [<c0333818>] (nfs4_try_mount+0x28/0x3c) [<c0333818>] (nfs4_try_mount) from [<c0313874>] (nfs_fs_mount+0x2cc/0x8c4) [<c0313874>] (nfs_fs_mount) from [<c023ef98>] (mount_fs+0x14/0xa4) [<c023ef98>] (mount_fs) from [<c025cba0>] (vfs_kern_mount+0x54/0x128) [<c025cba0>] (vfs_kern_mount) from [<c02600f0>] (do_mount+0x158/0xc7c) [<c02600f0>] (do_mount) from [<c0260f98>] (SyS_mount+0x8c/0xb4) [<c0260f98>] (SyS_mount) from [<c0107840>] (ret_fast_syscall+0x0/0x3c) Fix the problem by always creating new bdi for a superblock as we used to do. Reported-and-tested-by: Corentin Labbe <clabbe.montjoie@gmail.com> Fixes: 0d3b12584972ce5781179ad3f15cca3cdb5cae05 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
54551d85 |
|
25-Apr-2017 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Add a few more fatal I/O errors to nfs_error_is_fatal() EACCES, EDQUOT, EFBIG and ESTALE are all fatal errors as far as NFS I/O is concerned. They need to be reported back to the application. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
0db10944 |
|
11-Apr-2017 |
Jan Kara <jack@suse.cz> |
nfs: Convert to separately allocated bdi Allocate struct backing_dev_info separately instead of embedding it inside the superblock. This unifies handling of bdi among users. CC: Anna Schumaker <anna.schumaker@netapp.com> CC: linux-nfs@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
a33e4b03 |
|
08-Mar-2017 |
Weston Andros Adamson <dros@monkey.org> |
pNFS: return status from nfs4_pnfs_ds_connect The nfs4_pnfs_ds_connect path can call rpc_create which can fail or it can wait on another context to reach the same failure. This checks that the rpc_create succeeded and returns the error to the caller. When an error is returned, both the files and flexfiles layouts will return NULL from _prepare_ds(). The flexfiles layout will also return the layout with the error NFS4ERR_NXIO. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
61540bf6 |
|
08-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Clean up cache validity checking Consolidate the open-coded checking of NFS_I(inode)->cache_validity into a couple of helper functions. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1bcf4c5c |
|
02-Dec-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Allow getattr to also report readdirplus cache hits If the use called stat() on an 'ls -l' workload, and the attribute cache was successfully revalidate by READDIRPLUS, then we want to report that back so that the readdir code continues to use readdirplus. Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
7d38de3f |
|
17-Nov-2016 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Remove unused authflavour parameter from nfs_get_client() This parameter hasn't been used since f8407299 (Linux 3.11-rc2), so let's remove it from this function and callers. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
8cd79788 |
|
07-Oct-2016 |
Huang Ying <ying.huang@intel.com> |
mm: remove page_file_index After using the offset of the swap entry as the key of the swap cache, the page_index() becomes exactly same as page_file_index(). So the page_file_index() is removed and the callers are changed to use page_index() instead. Link: http://lkml.kernel.org/r/1473270649-27229-2-git-send-email-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
82c156f8 |
|
22-Sep-2016 |
Al Viro <viro@zeniv.linux.org.uk> |
switch generic_file_splice_read() to use of ->read_iter() ... and kill the ->splice_read() instances that can be switched to it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
bfc505de |
|
15-Sep-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
pNFS: Fix atime updates on pNFS clients Fix the code so that we always mark the atime as invalid in nfs4_read_done(). Currently, the expectation appears to be that the pNFS drivers should always do this, with the result that most of them don't. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
1cd66c93 |
|
27-Sep-2016 |
Miklos Szeredi <mszeredi@redhat.com> |
fs: make remaining filesystems use .rename2 This is trivial to do: - add flags argument to foo_rename() - check if flags is zero - assign foo_rename() to .rename2 instead of .rename This doesn't mean it's impossible to support RENAME_NOREPLACE for these filesystems, but it is not trivial, like for local filesystems. RENAME_NOREPLACE must guarantee atomicity (i.e. it shouldn't be possible for a file to be created on one host while it is overwritten by rename on another host). Filesystems converted: 9p, afs, ceph, coda, ecryptfs, kernfs, lustre, ncpfs, nfs, ocfs2, orangefs. After this, we can get rid of the duplicate interfaces for rename. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: David Howells <dhowells@redhat.com> [AFS] Acked-by: Mike Marshall <hubcap@omnibond.com> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jan Harkes <jaharkes@cs.cmu.edu> Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Oleg Drokin <oleg.drokin@intel.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Mark Fasheh <mfasheh@suse.com>
|
#
f844cd0d |
|
19-Sep-2016 |
Chao Yu <chao@kernel.org> |
nfs: cover ->migratepage with CONFIG_MIGRATION It will be more clean to use CONFIG_MIGRATION to cover nfs' private .migratepage in nfs_file_aops like we do in other part of nfs operations. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
04fa2c6b |
|
09-Sep-2016 |
Andy Adamson <andros@netapp.com> |
NFS pnfs data server multipath session trunking Try all multipath addresses for a data server. The first address that successfully connects and creates a session is the DS mount address. All subsequent addresses are tested for session trunking and added as aliases. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
a956beda |
|
16-Aug-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Allow the mount option retrans=0 We should allow retrans=0 as just meaning that every timeout is a major timeout, and that there is no increment in the timeout value. For instance, this means that we would allow TCP users to specify a flat timeout value of 60s, by specifying "timeo=600,retrans=0" in their mount option string. Siged-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
11fb9989 |
|
28-Jul-2016 |
Mel Gorman <mgorman@techsingularity.net> |
mm: move most file-based accounting to the node There are now a number of accounting oddities such as mapped file pages being accounted for on the node while the total number of file pages are accounted on the zone. This can be coped with to some extent but it's confusing so this patch moves the relevant file-based accounted. Due to throttling logic in the page allocator for reliable OOM detection, it is still necessary to track dirty and writeback pages on a per-zone basis. [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting] Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
b224f7cb |
|
13-Jun-2016 |
Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> |
nfs4: flexfiles: respect noresvport when establishing connections to DSes Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
3fc75f12 |
|
13-Jun-2016 |
Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> |
nfs4: clnt: respect noresvport when establishing connections to DSes result: $ mount -o vers=4.1 dcache-lab007:/ /pnfs $ cp /etc/profile /pnfs tcp 0 0 131.169.185.68:1005 131.169.191.141:32049 ESTABLISHED tcp 0 0 131.169.185.68:751 131.169.191.144:2049 ESTABLISHED $ $ mount -o vers=4.1,noresvport dcache-lab007:/ /pnfs $ cp /etc/profile /pnfs tcp 0 0 131.169.185.68:34894 131.169.191.141:32049 ESTABLISHED tcp 0 0 131.169.185.68:35722 131.169.191.144:2049 ESTABLISHED $ Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@desy.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
ce52914e |
|
07-Jun-2016 |
Scott Mayhew <smayhew@redhat.com> |
sunrpc: move NO_CRKEY_TIMEOUT to the auth->au_flags A generic_cred can be used to look up a unx_cred or a gss_cred, so it's not really safe to use the the generic_cred->acred->ac_flags to store the NO_CRKEY_TIMEOUT flag. A lookup for a unx_cred triggered while the KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated with the auth_cred to be in a state where they're perpetually doing 4K NFS_FILE_SYNC writes. This can be reproduced as follows: 1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys. They do not need to be the same export, nor do they even need to be from the same NFS server. Also, v3 is fine. $ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5 $ sudo mount -o v3,sec=sys server2:/export /mnt/sys 2. As the normal user, before accessing the kerberized mount, kinit with a short lifetime (but not so short that renewing the ticket would leave you within the 4-minute window again by the time the original ticket expires), e.g. $ kinit -l 10m -r 60m 3. Do some I/O to the kerberized mount and verify that the writes are wsize, UNSTABLE: $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1 4. Wait until you're within 4 minutes of key expiry, then do some more I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets set. Verify that the writes are 4K, FILE_SYNC: $ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1 5. Now do some I/O to the sec=sys mount. This will cause RPC_CRED_NO_CRKEY_TIMEOUT to be set: $ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1 6. Writes for that user will now be permanently 4K, FILE_SYNC for that user, regardless of which mount is being written to, until you reboot the client. Renewing the kerberos ticket (assuming it hasn't already expired) will have no effect. Grabbing a new kerberos ticket at this point will have no effect either. Move the flag to the auth->au_flags field (which is currently unused) and rename it slightly to reflect that it's no longer associated with the auth_cred->ac_flags. Add the rpc_auth to the arg list of rpcauth_cred_key_to_expire and check the au_flags there too. Finally, add the inode to the arg list of nfs_ctx_key_to_expire so we can determine the rpc_auth to pass to rpcauth_cred_key_to_expire. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
837bb1d7 |
|
25-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4.2: Fix writeback races in nfs4_copy_file_range We need to ensure that any writes to the destination file are serialised with the copy, meaning that the writeback has to occur under the inode lock. Also relax the writeback requirement on the source, and rely on the stateid checking to tell us if the source rebooted. Add the helper nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as is appropriate for pNFS servers that may need a layoutcommit. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
651b0e70 |
|
25-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Do not aggressively cache file attributes in the case of O_DIRECT A file that is open for O_DIRECT is by definition not obeying close-to-open cache consistency semantics, so let's not cache the attributes too aggressively either. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
a5864c99 |
|
03-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Do not serialise O_DIRECT reads and writes Allow dio requests to be scheduled in parallel, but ensuring that they do not conflict with buffered I/O. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
a5314a74 |
|
01-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Ensure we reset the write verifier 'committed' value on resend. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
8fc3c386 |
|
01-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix O_DIRECT verifier problems We should not be interested in looking at the value of the stable field, since that could take any value. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
5c6e5b60 |
|
22-Jun-2016 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Fix an Oops in the pNFS files and flexfiles connection setup to the DS Chris Worley reports: RIP: 0010:[<ffffffffa0245f80>] [<ffffffffa0245f80>] rpc_new_client+0x2a0/0x2e0 [sunrpc] RSP: 0018:ffff880158f6f548 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880234f8bc00 RCX: 000000000000ea60 RDX: 0000000000074cc0 RSI: 000000000000ea60 RDI: ffff880234f8bcf0 RBP: ffff880158f6f588 R08: 000000000001ac80 R09: ffff880237003300 R10: ffff880201171000 R11: ffffea0000d75200 R12: ffffffffa03afc60 R13: ffff880230c18800 R14: 0000000000000000 R15: ffff880158f6f680 FS: 00007f0e32673740(0000) GS:ffff88023fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000008 CR3: 0000000234886000 CR4: 00000000001406e0 Stack: ffffffffa047a680 0000000000000000 ffff880158f6f598 ffff880158f6f680 ffff880158f6f680 ffff880234d11d00 ffff88023357f800 ffff880158f6f7d0 ffff880158f6f5b8 ffffffffa024660a ffff880158f6f5b8 ffffffffa02492ec Call Trace: [<ffffffffa024660a>] rpc_create_xprt+0x1a/0xb0 [sunrpc] [<ffffffffa02492ec>] ? xprt_create_transport+0x13c/0x240 [sunrpc] [<ffffffffa0246766>] rpc_create+0xc6/0x1a0 [sunrpc] [<ffffffffa038e695>] nfs_create_rpc_client+0xf5/0x140 [nfs] [<ffffffffa038f31a>] nfs_init_client+0x3a/0xd0 [nfs] [<ffffffffa038f22f>] nfs_get_client+0x25f/0x310 [nfs] [<ffffffffa025cef8>] ? rpc_ntop+0xe8/0x100 [sunrpc] [<ffffffffa047512c>] nfs3_set_ds_client+0xcc/0x100 [nfsv3] [<ffffffffa041fa10>] nfs4_pnfs_ds_connect+0x120/0x400 [nfsv4] [<ffffffffa03d41c7>] nfs4_ff_layout_prepare_ds+0xe7/0x330 [nfs_layout_flexfiles] [<ffffffffa03d1b1b>] ff_layout_pg_init_write+0xcb/0x280 [nfs_layout_flexfiles] [<ffffffffa03a14dc>] __nfs_pageio_add_request+0x12c/0x490 [nfs] [<ffffffffa03a1fa2>] nfs_pageio_add_request+0xc2/0x2a0 [nfs] [<ffffffffa03a0365>] ? nfs_pageio_init+0x75/0x120 [nfs] [<ffffffffa03a5b50>] nfs_do_writepage+0x120/0x270 [nfs] [<ffffffffa03a5d31>] nfs_writepage_locked+0x61/0xc0 [nfs] [<ffffffff813d4115>] ? __percpu_counter_add+0x55/0x70 [<ffffffffa03a6a9f>] nfs_wb_single_page+0xef/0x1c0 [nfs] [<ffffffff811ca4a3>] ? __dec_zone_page_state+0x33/0x40 [<ffffffffa0395b21>] nfs_launder_page+0x41/0x90 [nfs] [<ffffffff811baba0>] invalidate_inode_pages2_range+0x340/0x3a0 [<ffffffff811bac17>] invalidate_inode_pages2+0x17/0x20 [<ffffffffa039960e>] nfs_release+0x9e/0xb0 [nfs] [<ffffffffa0399570>] ? nfs_open+0x60/0x60 [nfs] [<ffffffffa0394dad>] nfs_file_release+0x3d/0x60 [nfs] [<ffffffff81226e6c>] __fput+0xdc/0x1e0 [<ffffffff81226fbe>] ____fput+0xe/0x10 [<ffffffff810bf2e4>] task_work_run+0xc4/0xe0 [<ffffffff810a4188>] do_exit+0x2e8/0xb30 [<ffffffff8102471c>] ? do_audit_syscall_entry+0x6c/0x70 [<ffffffff811464e6>] ? __audit_syscall_exit+0x1e6/0x280 [<ffffffff810a4a5f>] do_group_exit+0x3f/0xa0 [<ffffffff810a4ad4>] SyS_exit_group+0x14/0x20 [<ffffffff8179b76e>] system_call_fastpath+0x12/0x71 Which seems to be due to a call to utsname() when in a task exit context in order to determine the hostname to set in rpc_new_client(). In reality, what we want here is not the hostname of the current task, but the hostname that was used to set up the metadata server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
67911c8f |
|
19-Jan-2016 |
Anna Schumaker <Anna.Schumaker@Netapp.com> |
NFS: Add nfs_commit_file() Copy will use this to set up a commit request for a generic range. I don't want to allocate a new pagecache entry for the file, so I needed to change parts of the commit path to handle requests with a null wb_page. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
09cbfeaf |
|
01-Apr-2016 |
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> |
mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
95d9f6c3 |
|
02-Mar-2016 |
Christoph Hellwig <hch@lst.de> |
nfs: remove nfs_inode_dio_wait Just call inode_dio_wait directly instead of through a pointless wrapper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
4ff79bc7 |
|
02-Mar-2016 |
Christoph Hellwig <hch@lst.de> |
nfs: remove nfs4_file_fsync The only difference to nfs_file_fsync is the call to pnfs_sync_inode. But pnfs_sync_inode is just an inline that calls a pNFS layout driver method if CONFIG_PNFS is designed, and thus can be called just fine from the core NFS module. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
6272dcc6 |
|
15-Jan-2016 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Simplify nfs_request_add_commit_list() arguments I noticed that all the callers of this function pass cinfo->mds->list as an argument in addition to the cinfo structure itself. Let's get rid of the extra argument, since it doesn't seem to be adding anything. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
210c7c17 |
|
06-Jan-2016 |
Benjamin Coddington <bcodding@redhat.com> |
NFS: Use wait_on_atomic_t() for unlock after readahead The use of wait_on_atomic_t() for waiting on I/O to complete before unlocking allows us to git rid of the NFS_IO_INPROGRESS flag, and thus the nfs_iocounter's flags member, and finally the nfs_iocounter altogether. The count of I/O is moved to the lock context, and the counter increment/decrement functions become simple enough to open-code. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> [Trond: Fix up conflict with existing function nfs_wait_atomic_killable()] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
138a2935 |
|
01-Oct-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Relax requirements in nfs_flush_incompatible If two processes share the same credentials and NFSv4 open stateid, then allow them both to dirty the same page, even if their nfs_open_context differs. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
dc602dd7 |
|
31-Dec-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS/pNFS: Fix up pNFS write reschedule layering violations and bugs The flexfiles layout in particular, seems to want to poke around in the O_DIRECT flags when retransmitting. This patch sets up an interface to allow it to call back into O_DIRECT to handle retransmission correctly. It also fixes a potential bug whereby we could change the behaviour of O_DIRECT if an error is already pending. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
0bcbf039 |
|
05-Dec-2015 |
Peng Tao <tao.peng@primarydata.com> |
nfs: handle request add failure properly When we fail to queue a read page to IO descriptor, we need to clean it up otherwise it is hanging around preventing nfs module from being removed. When we fail to queue a write page to IO descriptor, we need to clean it up and also save the failure status to open context. Then at file close, we can try to write pages back again and drop the page if it fails to writeback in .launder_page, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
48c9579a |
|
24-Nov-2015 |
Olga Kornievskaia <kolga@netapp.com> |
Adding stateid information to tracepoints Operations to which stateid information is added: close, delegreturn, open, read, setattr, layoutget, layoutcommit, test_stateid, write, lock, locku, lockt Format is "stateid=<seqid>:<crc32 hash stateid.other>", also "openstateid=", "layoutstateid=", and "lockstateid=" for open_file, layoutget, set_lock tracepoints. New function is added to internal.h, nfs_stateid_hash(), to compute the hash trace_nfs4_setattr() is moved from nfs4_do_setattr() to _nfs4_do_setattr() to get access to stateid. trace_nfs4_setattr and trace_nfs4_delegreturn are changed from INODE_EVENT to new event type, INODE_STATEID_EVENT which is same as INODE_EVENT but adds stateid information for locking tracepoints, moved trace_nfs4_set_lock() into _nfs4_do_setlk() to get access to stateid information, and removed trace_nfs4_lock_reclaim(), trace_nfs4_lock_expired() as they call into _nfs4_do_setlk() and both were previously same LOCK_EVENT type. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
dfd01f02 |
|
13-Dec-2015 |
Peter Zijlstra <peterz@infradead.org> |
sched/wait: Fix the signal handling fix Jan Stancek reported that I wrecked things for him by fixing things for Vladimir :/ His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which should not be possible, however my previous patch made this possible by unconditionally checking signal_pending(). We cannot use current->state as was done previously, because the instruction after the store to that variable it can be changed. We must instead pass the initial state along and use that. Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers") Reported-by: Jan Stancek <jstancek@redhat.com> Reported-by: Chris Mason <clm@fb.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Tested-by: Chris Mason <clm@fb.com> Reviewed-by: Paul Turner <pjt@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: tglx@linutronix.de Cc: Oleg Nesterov <oleg@redhat.com> Cc: hpa@zytor.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
5445b1fb |
|
05-Sep-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4: Respect the server imposed limit on how many changes we may cache The NFSv4 delegation spec allows the server to tell a client to limit how much data it cache after the file is closed. In return, the server guarantees enough free space to avoid ENOSPC situations, etc. Prior to this patch, we assumed we could always cache aggressively after close. Unfortunately, this causes problems with servers that set the limit to 0 and therefore do not offer any ENOSPC guarantees. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
d8efa4e6 |
|
13-Jul-2015 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Use RPC functions for matching sockaddrs They already exist and do the exact same thing. Let's save ourselves several lines of code! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
86d80f97 |
|
31-Jul-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4.1/pnfs: Fix atomicity of commit list updates pnfs_layout_mark_request_commit() needs to ensure that it adds the request to the commit list atomically with all the other updates in order to prevent corruption to buckets[ds_commit_idx].wlseg due to races with pnfs_generic_clear_request_commit(). Fixes: 338d00cfef07d ("pnfs: Refactor the *_layout_mark_request_commit...") Cc: stable@vger.kernel.org # v4.0+ Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
a49c2691 |
|
27-Jul-2015 |
Kinglong Mee <kinglongmee@gmail.com> |
nfs: Fix an oops caused by using other thread's stack space in ASYNC mode An oops caused by using other thread's stack space in sunrpc ASYNC sending thread. [ 9839.007187] ------------[ cut here ]------------ [ 9839.007923] kernel BUG at fs/nfs/nfs4xdr.c:910! [ 9839.008069] invalid opcode: 0000 [#1] SMP [ 9839.008069] Modules linked in: blocklayoutdriver rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm joydev iosf_mbi crct10dif_pclmul snd_timer crc32_pclmul crc32c_intel ghash_clmulni_intel snd soundcore ppdev pvpanic parport_pc i2c_piix4 serio_raw virtio_balloon parport acpi_cpufreq nfsd nfs_acl lockd grace auth_rpcgss sunrpc qxl drm_kms_helper virtio_net virtio_console virtio_blk ttm drm virtio_pci virtio_ring virtio ata_generic pata_acpi [ 9839.008069] CPU: 0 PID: 308 Comm: kworker/0:1H Not tainted 4.0.0-0.rc4.git1.3.fc23.x86_64 #1 [ 9839.008069] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 9839.008069] Workqueue: rpciod rpc_async_schedule [sunrpc] [ 9839.008069] task: ffff8800d8b4d8e0 ti: ffff880036678000 task.ti: ffff880036678000 [ 9839.008069] RIP: 0010:[<ffffffffa0339cc9>] [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4] [ 9839.008069] RSP: 0018:ffff88003667ba58 EFLAGS: 00010246 [ 9839.008069] RAX: 0000000000000000 RBX: 000000001fc15e18 RCX: ffff8800c0193800 [ 9839.008069] RDX: ffff8800e4ae3f24 RSI: 000000001fc15e2c RDI: ffff88003667bcd0 [ 9839.008069] RBP: ffff88003667ba58 R08: ffff8800d9173008 R09: 0000000000000003 [ 9839.008069] R10: ffff88003667bcd0 R11: 000000000000000c R12: 0000000000010000 [ 9839.008069] R13: ffff8800d9173350 R14: 0000000000000000 R15: ffff8800c0067b98 [ 9839.008069] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 [ 9839.008069] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9839.008069] CR2: 00007f988c9c8bb0 CR3: 00000000d99b6000 CR4: 00000000000407f0 [ 9839.008069] Stack: [ 9839.008069] ffff88003667bbc8 ffffffffa03412c5 00000000c6c55680 ffff880000000003 [ 9839.008069] 0000000000000088 00000010c6c55680 0001000000000002 ffffffff816e87e9 [ 9839.008069] 0000000000000000 00000000477290e2 ffff88003667bab8 ffffffff81327ba3 [ 9839.008069] Call Trace: [ 9839.008069] [<ffffffffa03412c5>] encode_attrs+0x435/0x530 [nfsv4] [ 9839.008069] [<ffffffff816e87e9>] ? inet_sendmsg+0x69/0xb0 [ 9839.008069] [<ffffffff81327ba3>] ? selinux_socket_sendmsg+0x23/0x30 [ 9839.008069] [<ffffffff8164c1df>] ? do_sock_sendmsg+0x9f/0xc0 [ 9839.008069] [<ffffffff8164c278>] ? kernel_sendmsg+0x58/0x70 [ 9839.008069] [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc] [ 9839.008069] [<ffffffffa011acc0>] ? xdr_reserve_space+0x20/0x170 [sunrpc] [ 9839.008069] [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4] [ 9839.008069] [<ffffffffa03419a5>] encode_open+0x2d5/0x340 [nfsv4] [ 9839.008069] [<ffffffffa0341b40>] ? nfs4_xdr_enc_open_noattr+0x130/0x130 [nfsv4] [ 9839.008069] [<ffffffffa011ab89>] ? xdr_encode_opaque+0x19/0x20 [sunrpc] [ 9839.008069] [<ffffffffa0339cfb>] ? encode_string+0x2b/0x40 [nfsv4] [ 9839.008069] [<ffffffffa0341bf3>] nfs4_xdr_enc_open+0xb3/0x140 [nfsv4] [ 9839.008069] [<ffffffffa0110a4c>] rpcauth_wrap_req+0xac/0xf0 [sunrpc] [ 9839.008069] [<ffffffffa01017db>] call_transmit+0x18b/0x2d0 [sunrpc] [ 9839.008069] [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc] [ 9839.008069] [<ffffffffa0101650>] ? call_decode+0x860/0x860 [sunrpc] [ 9839.008069] [<ffffffffa010caa0>] __rpc_execute+0x90/0x460 [sunrpc] [ 9839.008069] [<ffffffffa010ce85>] rpc_async_schedule+0x15/0x20 [sunrpc] [ 9839.008069] [<ffffffff810b452b>] process_one_work+0x1bb/0x410 [ 9839.008069] [<ffffffff810b47d3>] worker_thread+0x53/0x470 [ 9839.008069] [<ffffffff810b4780>] ? process_one_work+0x410/0x410 [ 9839.008069] [<ffffffff810b4780>] ? process_one_work+0x410/0x410 [ 9839.008069] [<ffffffff810ba7b8>] kthread+0xd8/0xf0 [ 9839.008069] [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180 [ 9839.008069] [<ffffffff81786418>] ret_from_fork+0x58/0x90 [ 9839.008069] [<ffffffff810ba6e0>] ? kthread_worker_fn+0x180/0x180 [ 9839.008069] Code: 00 00 48 c7 c7 21 fa 37 a0 e8 94 1c d6 e0 c6 05 d2 17 05 00 01 8b 03 eb d7 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 89 f3 [ 9839.008069] RIP [<ffffffffa0339cc9>] reserve_space.part.73+0x9/0x10 [nfsv4] [ 9839.008069] RSP <ffff88003667ba58> [ 9839.071114] ---[ end trace cc14c03adb522e94 ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
93f78d88 |
|
22-May-2015 |
Tejun Heo <tj@kernel.org> |
writeback: move backing_dev_info->bdi_stat[] into bdi_writeback Currently, a bdi (backing_dev_info) embeds single wb (bdi_writeback) and the role of the separation is unclear. For cgroup support for writeback IOs, a bdi will be updated to host multiple wb's where each wb serves writeback IOs of a different cgroup on the bdi. To achieve that, a wb should carry all states necessary for servicing writeback IOs for a cgroup independently. This patch moves bdi->bdi_stat[] into wb. * enum bdi_stat_item is renamed to wb_stat_item and the prefix of all enums is changed from BDI_ to WB_. * BDI_STAT_BATCH() -> WB_STAT_BATCH() * [__]{add|inc|dec|sum}_wb_stat(bdi, ...) -> [__]{add|inc}_wb_stat(wb, ...) * bdi_stat[_error]() -> wb_stat[_error]() * bdi_writeout_inc() -> wb_writeout_inc() * stat init is moved to bdi_wb_init() and bdi_wb_exit() is added and frees stat. * As there's still only one bdi_writeback per backing_dev_info, all uses of bdi->stat[] are mechanically replaced with bdi->wb.stat[] introducing no behavior changes. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
a08a8cd3 |
|
26-Feb-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Add attribute update barriers to NFS writebacks Ensure that other operations that race with our write RPC calls cannot revert the file size updates that were made on the server. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Tested-by: Chuck Lever <chuck.lever@oracle.com>
|
#
d15bc38d |
|
13-Feb-2015 |
Tom Haynes <thomas.haynes@primarydata.com> |
nfs: Provide and use helper functions for marking a page as unstable Signed-off-by: Tom Haynes <loghyr@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
ea7c38fe |
|
05-Feb-2015 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4: Ensure we reference the inode for return-on-close in delegreturn If we have to do a return-on-close in the delegreturn code, then we must ensure that the inode and super block remain referenced. Cc: Peng Tao <tao.peng@primarydata.com> Cc: stable@vger.kernel.org # 3.17.x Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Peng Tao <tao.peng@primarydata.com>
|
#
012fa16d |
|
30-Nov-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs: add a helper to set NFS_ODIRECT_RESCHED_WRITES to direct writes To allow pnfs LD to ask direct writes to be resend. Signed-off-by: Peng Tao <tao.peng@primarydata.com>
|
#
48d635f1 |
|
09-Nov-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs: add nfs_pgio_current_mirror helper Let it return current nfs_pgio_mirror in use depending on pg_mirror_count. For read, we always use pg_mirrors[0], so this effectively gives us freedom to use pg_mirror_idx to track the actual mirror to read from through out the IO stack. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
|
#
47af81f2 |
|
09-Nov-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs: only reset desc->pg_mirror_idx when mirroring is supported so that we don't reset desc->pg_mirror_idx for read unnecessarily. Remove WARN_ON_ONCE from __nfs_pageio_add_request to allow LD to set pg_mirror_idx for read where pg_mirror_count is always 1. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <loghyr@primarydata.com>
|
#
a7d42ddb |
|
19-Sep-2014 |
Weston Andros Adamson <dros@primarydata.com> |
nfs: add mirroring support to pgio layer This patch adds mirrored write support to the pgio layer. The default is to use one mirror, but pgio callers may define callbacks to change this to any value up to the (arbitrarily selected) limit of 16. The basic idea is to break out members of nfs_pageio_descriptor that cannot be shared between mirrored DSes and put them in a new structure. Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
|
#
b57ff130 |
|
05-Sep-2014 |
Weston Andros Adamson <dros@primarydata.com> |
pnfs: pass ds_commit_idx through the commit path Pass ds_commit_idx through the nfs commit path. It's used to select the commit bucket when using pnfs and is ignored when not using pnfs. Several functions had to be changed: nfs_retry_commit, nfs_mark_request_commit, pnfs_mark_request_commit and the pnfs layout driver .mark_request_commit functions. Signed-off-by: Tom Haynes <loghyr@primarydata.com>
|
#
46a5ab47 |
|
13-Jun-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs: allow to specify cred in nfs_initiate_pgio so that flexfile layout client can pass in DS credential instead of using user cred, which will be done in the next patch. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
|
#
c36aae9a |
|
08-Jun-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs: allow different protocol in nfs_initiate_commit pnfs flexfile layout client may want to use NFSv3 ops rather than the default MDS v4 ops. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
|
#
abde71f4 |
|
09-Jun-2014 |
Tom Haynes <loghyr@primarydata.com> |
pnfs: Add nfs_rpc_ops in calls to nfs_initiate_pgio Signed-off-by: Tom Haynes <loghyr@primarydata.com>
|
#
30626f9c |
|
30-May-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs41: allow LD to choose DS connection version/minor_version flexfile layout may need to set such when making DS connections. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
|
#
1a04c6e1 |
|
30-May-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfsv3: introduce nfs3_set_ds_client The flexfiles layout wants to create DS connection over NFSv3. Add nfs3_set_ds_client to allow that to happen. Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
|
#
064172f3 |
|
29-May-2014 |
Peng Tao <tao.peng@primarydata.com> |
nfs41: allow LD to choose DS connection auth flavor flexfile layout may use different auth flavor as specified by MDS. Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Peng Tao <tao.peng@primarydata.com> Signed-off-by: Tom Haynes <Thomas.Haynes@primarydata.com>
|
#
2ef47eb1 |
|
09-Dec-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Fix use of nfs_attr_use_mounted_on_fileid() This function call was being optimized out during nfs_fhget(), leading to situations where we have a valid fileid but still want to use the mounted_on_fileid. For example, imagine we have our server configured like this: server % df Filesystem Size Used Avail Use% Mounted on /dev/vda1 9.1G 6.5G 1.9G 78% / /dev/vdb1 487M 2.3M 456M 1% /exports /dev/vdc1 487M 2.3M 456M 1% /exports/vol1 /dev/vdd1 487M 2.3M 456M 1% /exports/vol2 If our client mounts /exports and tries to do a "chown -R" across the entire mountpoint, we will get a nasty message warning us about a circular directory structure. Running chown with strace tells me that each directory has the same device and inode number: newfstatat(AT_FDCWD, "/nfs/", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0 newfstatat(4, "vol1", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0 newfstatat(4, "vol2", {st_dev=makedev(0, 38), st_ino=2, ...}) = 0 With this patch the mounted_on_fileid values are used for st_ino, so the directory loop warning isn't reported. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
7b14a213 |
|
14-Jan-2015 |
Christoph Hellwig <hch@lst.de> |
nfs: don't call bdi_unregister bdi_destroy already does all the work, and if we delay freeing the anon bdev we can get away with just that single call. Addintionally remove the call during mount failure, as deactivate_super_locked will already call ->kill_sb and clean up the bdi for us. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
|
#
00a36a10 |
|
02-Sep-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Move v3 declarations out of internal.h I am generally against the "one big header file" approach, and everything in the client includes this file. Let's move all the NFS v3 declarations into a v3-only header file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1c994a09 |
|
27-Aug-2014 |
Jeff Layton <jlayton@kernel.org> |
locks: consolidate "nolease" routines GFS2 and NFS have setlease routines that always just return -EINVAL. Turn that into a generic routine that can live in fs/libfs.c. Cc: <linux-nfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: <cluster-devel@redhat.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Acked-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
65b38851 |
|
31-Jul-2014 |
Eric W. Biederman <ebiederm@xmission.com> |
NFS: Fix /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes The usage of pid_ns->child_reaper->nsproxy->net_ns in nfs_server_list_open and nfs_client_list_open is not safe. /proc for a pid namespace can remain mounted after the all of the process in that pid namespace have exited. There are also times before the initial process in a pid namespace has started or after the initial process in a pid namespace has exited where pid_ns->child_reaper can be NULL or stale. Making the idiom pid_ns->child_reaper->nsproxy a double whammy of problems. Luckily all that needs to happen is to move /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes under /proc/net to /proc/net/nfsfs/servers and /proc/net/nfsfs/volumes and add a symlink from the original location, and to use seq_open_net as it has been designed. Cc: stable@vger.kernel.org Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
#
c1221321 |
|
06-Jul-2014 |
NeilBrown <neilb@suse.de> |
sched: Allow wait_on_bit_action() functions to support a timeout It is currently not possible for various wait_on_bit functions to implement a timeout. While the "action" function that is called to do the waiting could certainly use schedule_timeout(), there is no way to carry forward the remaining timeout after a false wake-up. As false-wakeups a clearly possible at least due to possible hash collisions in bit_waitqueue(), this is a real problem. The 'action' function is currently passed a pointer to the word containing the bit being waited on. No current action functions use this pointer. So changing it to something else will be a little noisy but will have no immediate effect. This patch changes the 'action' function to take a pointer to the "struct wait_bit_key", which contains a pointer to the word containing the bit so nothing is really lost. It also adds a 'private' field to "struct wait_bit_key", which is initialized to zero. An action function can now implement a timeout with something like static int timed_out_waiter(struct wait_bit_key *key) { unsigned long waited; if (key->private == 0) { key->private = jiffies; if (key->private == 0) key->private -= 1; } waited = jiffies - key->private; if (waited > 10 * HZ) return -EAGAIN; schedule_timeout(waited - 10 * HZ); return 0; } If any other need for context in a waiter were found it would be easy to use ->private for some other purpose, or even extend "struct wait_bit_key". My particular need is to support timeouts in nfs_release_page() to avoid deadlocks with loopback mounted NFS. While wait_on_bit_timeout() would be a cleaner interface, it will not meet my need. I need the timeout to be sensitive to the state of the connection with the server, which could change. So I need to use an 'action' interface. Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: David Howells <dhowells@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
#
d4581383 |
|
11-Jul-2014 |
Weston Andros Adamson <dros@primarydata.com> |
nfs: handle multiple reqs in nfs_page_async_flush Change nfs_find_and_lock_request so nfs_page_async_flush can handle multiple requests in a page. There is only one request for a page the first time nfs_page_async_flush is called, but if a write or commit fails, async_flush is called again and there may be multiple requests associated with the page. The solution is to merge all the requests in a page group into a single request before calling nfs_pageio_add_request. Rename nfs_find_and_lock_request to nfs_lock_and_join_requests and change it to first lock all requests for the page, then cancel and merge all subrequests into the head request. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
c65e6254 |
|
09-Jun-2014 |
Weston Andros Adamson <dros@primarydata.com> |
nfs: remove unused writeverf code Remove duplicate writeverf structure from merge of nfs_pgio_header and nfs_pgio_data and remove writeverf related flags and logic to handle more than one RPC per nfs_pgio_header. Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
d45f60c6 |
|
09-Jun-2014 |
Weston Andros Adamson <dros@primarydata.com> |
nfs: merge nfs_pgio_data into _header struct nfs_pgio_data only exists as a member of nfs_pgio_header, but is passed around everywhere, because there used to be multiple _data structs per _header. Many of these functions then use the _data to find a pointer to the _header. This patch cleans this up by merging the nfs_pgio_data structure into nfs_pgio_header and passing nfs_pgio_header around instead. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1e7f3a48 |
|
09-Jun-2014 |
Weston Andros Adamson <dros@primarydata.com> |
nfs: move nfs_pgio_data and remove nfs_rw_header nfs_rw_header was used to allocate an nfs_pgio_header along with an nfs_pgio_data, because a _header would need at least one _data. Now there is only ever one nfs_pgio_data for each nfs_pgio_header -- move it to nfs_pgio_header and get rid of nfs_rw_header. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
4da54c21 |
|
05-Apr-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: switch to iter_splice_write_file() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
41d8d5b7 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common nfs_pageio_ops struct At this point the read and write structures look identical, so combine them into something shared by both. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
cf485fcd |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common generic_pg_pgios() What we have here is two functions that look identical. Let's share some more code! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
c3766276 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common multiple_pgios() function Once again, these two functions look identical in the read and write case. Time to combine them together! Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
1ed26f33 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common initiate_pgio() function Most of this code is the same for both the read and write paths, so combine everything and use the rw_ops when necessary. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
ef2c488c |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a generic_pgio function These functions are almost identical on both the read and write side. FLUSH_COND_STABLE will never be set for the read path, so leaving it in the generic code won't hurt anything. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
844c9e69 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common pgio_error function At this point, the read and write versions of this function look identical so both should use the same function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
ce59515c |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common rpcsetup function for reads and writes Write adds a little bit of code dealing with flush flags, but since "how" will always be 0 when reading we can share the code. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
6f92fa45 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common rpc_call_ops struct The read and write paths set up this struct in exactly the same way, so create a single shared struct. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
0eecb214 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common nfs_pgio_result_common function Combining these functions will let me make a single nfs_rw_common_ops struct (see the next patch). Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
a4cdda59 |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common pgio_rpc_prepare function The read and write paths do exactly the same thing for the rpc_prepare rpc_op. This patch combines them together into a single function. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
4a0de55c |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common rw_header_alloc and rw_header_free function I create a new struct nfs_rw_ops to decide the differences between reads and writes. This struct will be set when initializing a new nfs_pgio_descriptor, and then passed on to the nfs_rw_header when a new header is allocated. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
00bfa30a |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common pgio_alloc and pgio_release function These functions are identical for the read and write paths so they can be combined. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
c0752cdf |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common read and write header struct The only difference is the write verifier field, but we can keep that for a little bit longer. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
9c7e1b3d |
|
06-May-2014 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
NFS: Create a common read and write data struct At this point, the only difference between nfs_read_data and nfs_write_data is the write verifier. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
fab5fc25 |
|
16-Apr-2014 |
Christoph Hellwig <hch@lst.de> |
nfs: remove ->read_pageio_init from rpc ops The read_pageio_init method is just a very convoluted way to grab the right nfs_pageio_ops vector. The vector to chose is not a choice of protocol version, but just a pNFS vs MDS I/O choice that can simply be done inside nfs_pageio_init_read based on the presence of a layout driver, and a new force_mds flag to the special case of falling back to MDS I/O on a pNFS-capable volume. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
a20c93e3 |
|
16-Apr-2014 |
Christoph Hellwig <hch@lst.de> |
nfs: remove ->write_pageio_init from rpc ops The write_pageio_init method is just a very convoluted way to grab the right nfs_pageio_ops vector. The vector to chose is not a choice of protocol version, but just a pNFS vs MDS I/O choice that can simply be done inside nfs_pageio_init_write based on the presence of a layout driver, and a new force_mds flag to the special case of falling back to MDS I/O on a pNFS-capable volume. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
edaf4369 |
|
03-Apr-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: switch to ->write_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
3aa2d199 |
|
02-Apr-2014 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: switch to ->read_iter() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
0e862a40 |
|
17-Mar-2014 |
Jeff Layton <jlayton@kernel.org> |
nfs: make nfs_async_rename non-static ...and move the prototype for nfs_sillyrename to internal.h. Signed-off-by: Jeff Layton <jlayton@redhat.com> Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
292f503c |
|
16-Feb-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSv4: Use the correct net namespace in nfs4_update_server We need to use the same net namespace that was used to resolve the hostname and sockaddr arguments. Fixes: 32e62b7c3ef09 (NFS: Add nfs4_update_server) Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
311324ad |
|
07-Feb-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Be more aggressive in using readdirplus for 'ls -l' situations Try to detect 'ls -l' by having nfs_getattr() look at whether or not there is an opendir() file descriptor for the parent directory. If so, then assume that we want to force use of readdirplus in order to avoid the multiple GETATTR calls over the wire. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
fd1defc2 |
|
06-Feb-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFS: Do not set NFS_INO_INVALID_LABEL unless server supports labeled NFS Commit aa9c2669626c (NFS: Client implementation of Labeled-NFS) introduces a performance regression. When nfs_zap_caches_locked is called, it sets the NFS_INO_INVALID_LABEL flag irrespectively of whether or not the NFS server supports security labels. Since that flag is never cleared, it means that all calls to nfs_revalidate_inode() will now trigger an on-the-wire GETATTR call. This patch ensures that we never set the NFS_INO_INVALID_LABEL unless the server advertises support for labeled NFS. It also causes nfs_setsecurity() to clear NFS_INO_INVALID_LABEL when it has successfully set the security label for the inode. Finally it gets rid of the NFS_INO_INVALID_LABEL cruft from nfs_update_inode, which has nothing to do with labeled NFS. Reported-by: Neil Brown <neilb@suse.de> Cc: stable@vger.kernel.org # 3.11+ Tested-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
694e096f |
|
12-Nov-2013 |
Anna Schumaker <bjschuma@netapp.com> |
NFS: Enabling v4.2 should not recompile nfsd and lockd When CONFIG_NFS_V4_2 is toggled nfsd and lockd will be recompiled, instead of only the nfs client. This patch moves a small amount of code into the client directory to avoid unnecessary recompiles. Signed-off-by: Anna Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
4d4b69dd |
|
18-Oct-2013 |
Weston Andros Adamson <dros@netapp.com> |
NFS: add support for multiple sec= mount options This patch adds support for multiple security options which can be specified using a colon-delimited list of security flavors (the same syntax as nfsd's exports file). This is useful, for instance, when NFSv4.x mounts cross SECINFO boundaries. With this patch a user can use "sec=krb5i,krb5p" to mount a remote filesystem using krb5i, but can still cross into krb5p-only exports. New mounts will try all security options before failing. NFSv4.x SECINFO results will be compared against the sec= flavors to find the first flavor in both lists or if no match is found will return -EPERM. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a3f73c27 |
|
18-Oct-2013 |
Weston Andros Adamson <dros@netapp.com> |
NFS: separate passed security flavs from selected When filling parsed_mount_data, store the parsed sec= mount option in the new struct nfs_auth_info and the chosen flavor in selected_flavor. This patch lays the groundwork for supporting multiple sec= options. Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
32e62b7c |
|
17-Oct-2013 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Add nfs4_update_server New function nfs4_update_server() moves an nfs_server to a different nfs_client. This is done as part of migration recovery. Though it may be appealing to think of them as the same thing, migration recovery is not the same as following a referral. For a referral, the client has not descended into the file system yet: it has no nfs_server, no super block, no inodes or open state. It is enough to simply instantiate the nfs_server and super block, and perform a referral mount. For a migration, however, we have all of those things already, and they have to be moved to a different nfs_client. No local namespace changes are needed here. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1ab6c499 |
|
27-Aug-2013 |
Dave Chinner <dchinner@redhat.com> |
fs: convert fs shrinkers to new scan/count API Convert the filesystem shrinkers to use the new API, and standardise some of the behaviours of the shrinkers at the same time. For example, nr_to_scan means the number of objects to scan, not the number of objects to free. I refactored the CIFS idmap shrinker a little - it really needs to be broken up into a shrinker per tree and keep an item count with the tree root so that we don't need to walk the tree every time the shrinker needs to count the number of objects in the tree (i.e. all the time under memory pressure). [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree] [assorted fixes folded in] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@openvz.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Acked-by: Jan Kara <jack@suse.cz> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
5e6b1990 |
|
06-Sep-2013 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Fix security auto-negotiation NFSv4 security auto-negotiation has been broken since commit 4580a92d44e2b21c2254fa5fef0f1bfb43c82318 (NFS: Use server-recommended security flavor by default (NFSv3)) because nfs4_try_mount() will automatically select AUTH_SYS if it sees no auth flavours. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Chuck Lever <chuck.lever@oracle.com>
|
#
0e20162e |
|
06-Sep-2013 |
Andy Adamson <andros@netapp.com> |
NFSv4.1 Use MDS auth flavor for data server connection Commit 4edaa308 "NFS: Use "krb5i" to establish NFSv4 state whenever possible" uses the nfs_client cl_rpcclient for all state management operations, and will use krb5i or auth_sys with no regard to the mount command authflavor choice. The MDS, as any NFSv4.1 mount point, uses the nfs_server rpc client for all non-state management operations with a different nfs_server for each fsid encountered traversing the mount point, each with a potentially different auth flavor. pNFS data servers are not mounted in the normal sense as there is no associated nfs_server structure. Data servers can also export multiple fsids, each with a potentially different auth flavor. Data servers need to use the same authflavor as the MDS server rpc client for non-state management operations. Populate a list of rpc clients with the MDS server rpc client auth flavor for the DS to use. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
dc24826b |
|
14-Aug-2013 |
Andy Adamson <andros@netapp.com> |
NFS avoid expired credential keys for buffered writes We must avoid buffering a WRITE that is using a credential key (e.g. a GSS context key) that is about to expire or has expired. We currently will paint ourselves into a corner by returning success to the applciation for such a buffered WRITE, only to discover that we do not have permission when we attempt to flush the WRITE (and potentially associated COMMIT) to disk. Use the RPC layer credential key timeout and expire routines which use a a watermark, gss_key_expire_timeo. We test the key in nfs_file_write. If a WRITE is using a credential with a key that will expire within watermark seconds, flush the inode in nfs_write_end and send only NFS_FILE_SYNC WRITEs by adding nfs_ctx_key_to_expire to nfs_need_sync_write. Note that this results in single page NFS_FILE_SYNC WRITEs. Signed-off-by: Andy Adamson <andros@netapp.com> [Trond: removed a pr_warn_ratelimited() for now] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1264a2f0 |
|
12-Aug-2013 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: refactor code for calculating the crc32 hash of a filehandle We want to be able to display the crc32 hash of the filehandle in tracepoints. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f8407299 |
|
24-Jul-2013 |
Andy Adamson <andros@netapp.com> |
NFS Remove unused authflavour parameter from init_client Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f1c097be |
|
25-Jun-2013 |
Andy Adamson <andros@netapp.com> |
NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize The GETDEVICEINFO gdia_maxcount represents all of the data being returned within the GETDEVICEINFO4resok structure and includes the XDR overhead. The CREATE_SESSION ca_maxresponsesize is the maximum reply and includes the RPC headers (including security flavor credentials and verifiers). Split out the struct pnfs_device field maxcount which is the gdia_maxcount from the pglen field which is the reply (the total) buffer length. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
459de2ed |
|
05-Jun-2013 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Make callbacks minor version generic I found a few places that hardcode the minor version number rather than making it dependent on the protocol the callback came in over. This patch makes it easier to add new minor versions in the future. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
577b4232 |
|
08-Apr-2013 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Add functionality to allow waiting on all outstanding reads to complete This will later allow NFS locking code to wait for readahead to complete before releasing byte range locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
322b2b90 |
|
11-Jan-2013 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
Revert "NFS: add nfs_sb_deactive_async to avoid deadlock" This reverts commit 324d003b0cd82151adbaecefef57b73f7959a469. The deadlock turned out to be caused by a workqueue limitation that has now been worked around in the RPC code (see comment in rpc_free_task). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
eed99357 |
|
14-Dec-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Ensure that we always drop inodes that have been marked as stale There is no need to cache stale inodes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
aaea7d2f |
|
12-Dec-2012 |
Yanchuan Nian <ycnian@gmail.com> |
nfs: Remove duplicate function declaration in internal.h Remove duplicate function declaration in internal.h Signed-off-by: Yanchuan Nian <ycnian@gmail.com> [Trond: Added nfs_pageio_init_read, which suffered from the same problem] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
fd0c0953 |
|
01-Nov-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Simplify the NFSv4/v4.1 synchronous call switch We shouldn't need to pass the 'cache_reply' parameter if we initialise the sequence_args/sequence_res in the caller. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
76e697ba |
|
26-Nov-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4.1: Move slot table and session struct definitions to nfs4session.h Clean up. Gather NFSv4.1 slot definitions in fs/nfs/nfs4session.h. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
73e39aaa |
|
25-Nov-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4.1: Cleanup move session slot management to fs/nfs/nfs4session.c NFSv4.1 session management is getting complex enough to deserve a separate file. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
324d003b |
|
30-Oct-2012 |
Weston Andros Adamson <dros@netapp.com> |
NFS: add nfs_sb_deactive_async to avoid deadlock Use nfs_sb_deactive_async instead of nfs_sb_deactive when in a workqueue context. This avoids a deadlock where rpc_shutdown_client loops forever in a workqueue kworker context, trying to kill all RPC tasks associated with the client, while one or more of these tasks have already been assigned to the same kworker (and will never run rpc_exit_task). This approach is needed because RPC tasks that have already been assigned to a kworker by queue_work cannot be canceled, as explained in the comment for workqueue.c:insert_wq_barrier. Signed-off-by: Weston Andros Adamson <dros@netapp.com> [Trond: add module_get/put.] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
97a54868 |
|
21-Oct-2012 |
Ben Hutchings <ben@decadent.org.uk> |
nfs: Show original device name verbatim in /proc/*/mount{s,info} Since commit c7f404b ('vfs: new superblock methods to override /proc/*/mount{s,info}'), nfs_path() is used to generate the mounted device name reported back to userland. nfs_path() always generates a trailing slash when the given dentry is the root of an NFS mount, but userland may expect the original device name to be returned verbatim (as it used to be). Make this canonicalisation optional and change the callers accordingly. [jrnieder@gmail.com: use flag instead of bool argument] Reported-and-tested-by: Chris Hiestand <chiestand@salk.edu> Reference: http://bugs.debian.org/669314 Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: <stable@vger.kernel.org> # v2.6.39+ Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6296556f |
|
25-Sep-2012 |
Peng Tao <bergwolf@gmail.com> |
NFS41: send real write size in layoutget For buffer write, block layout client scan inode mapping to find next hole and use offset-to-hole as layoutget length. Object layout client uses offset-to-isize as layoutget length. For direct write, both block layout and object layout use dreq->bytes_left. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
05f4c350 |
|
14-Sep-2012 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Discover NFSv4 server trunking when mounting "Server trunking" is a fancy named for a multi-homed NFS server. Trunking might occur if a client sends NFS requests for a single workload to multiple network interfaces on the same server. There are some implications for NFSv4 state management that make it useful for a client to know if a single NFSv4 server instance is multi-homed. (Note this is only a consideration for NFSv4, not for legacy versions of NFS, which are stateless). If a client cares about server trunking, no NFSv4 operations can proceed until that client determines who it is talking to. Thus server IP trunking discovery must be done when the client first encounters an unfamiliar server IP address. The nfs_get_client() function walks the nfs_client_list and matches on server IP address. The outcome of that walk tells us immediately if we have an unfamiliar server IP address. It invokes nfs_init_client() in this case. Thus, nfs4_init_client() is a good spot to perform trunking discovery. Discovery requires a client to establish a fresh client ID, so our client will now send SETCLIENTID or EXCHANGE_ID as the first NFS operation after a successful ping, rather than waiting for an application to perform an operation that requires NFSv4 state. The exact process for detecting trunking is different for NFSv4.0 and NFSv4.1, so a minorversion-specific init_client callout method is introduced. CLID_INUSE recovery is important for the trunking discovery process. CLID_INUSE is a sign the server recognizes the client's nfs_client_id4 id string, but the client is using the wrong principal this time for the SETCLIENTID operation. The SETCLIENTID must be retried with a series of different principals until one works, and then the rest of trunking discovery can proceed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
8cb7f74e |
|
14-Sep-2012 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: nfs_parsed_mount_options can use unsigned int fs/nfs/super.c: In function ‘nfs_compare_remount_data’: fs/nfs/super.c:2042:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2043:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2044:20: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2046:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2047:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2048:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2049:21: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] fs/nfs/super.c:2050:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] Seen with gcc (GCC) 4.6.3 20120306 (Red Hat 4.6.3-2). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d56b4ddf |
|
31-Jul-2012 |
Mel Gorman <mgorman@suse.de> |
nfs: teach the NFS client how to treat PG_swapcache pages Replace all relevant occurences of page->index and page->mapping in the NFS client with the new page_file_index() and page_file_mapping() functions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: David S. Miller <davem@davemloft.net> Cc: Eric B Munson <emunson@mgebm.net> Cc: Eric Paris <eparis@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Christie <michaelc@cs.wisc.edu> Cc: Neil Brown <neilb@suse.de> Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Xiaotian Feng <dfeng@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
89d77c8f |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Convert v4 into a module This patch exports symbols needed by the v4 module. In addition, I also switch over to using IS_ENABLED() to check if CONFIG_NFS_V4 or CONFIG_NFS_V4_MODULE are set. The module (nfs4.ko) will be created in the same directory as nfs.ko and will be automatically loaded the first time you try to mount over NFS v4. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1c606fb7 |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Convert v3 into a module This patch exports symbols and moves over the final structures needed by the v3 module. In addition, I also switch over to using IS_ENABLED() to check if CONFIG_NFS_V3 or CONFIG_NFS_V3_MODULE are set. The module (nfs3.ko) will be created in the same directory as nfs.ko and will be automatically loaded the first time you try to mount over NFS v3. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
19d87ca3 |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Split out remaining NFS v4 inode functions Somehow I missed this in my previous patch series, but these functions are only needed by the v4 code and should be moved to a v4-only file. I wasn't exactly sure where I should put these functions, so I moved them into nfs4super.c where I could make them static. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6a74490d |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Pass super operations and xattr handlers in the nfs_subversion I can set all variables in the nfs_fill_super() function, allowing me to remove the nfs4_fill_super() function. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1179acc6 |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Only initialize the ACL client in the v3 case v2 and v4 don't use it, so I create two new nfs_rpc_ops functions to initialize the ACL client only when we are using v3. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ff9099f2 |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create a try_mount rpc op I'm already looking up the nfs subversion in nfs_fs_mount(), so I have easy access to rpc_ops that used to be difficult to reach. This allows me to set up a different mount path for NFS v2/3 and NFS v4. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ab7017a3 |
|
30-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Add version registering framework This patch adds in the code to track multiple versions of the NFS protocol. I created default structures for v2, v3 and v4 so that each version can continue to work while I convert them into kernel modules. I also removed the const parameter from the rpc_version array so that I can change it at runtime. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
fbdefd64 |
|
16-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Split out the NFS v4 filesystem types This allows me to move the v4 mounting and unmounting functions out of the generic client and into a file that is only compiled when CONFIG_NFS_V4 is enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
fcf10398 |
|
16-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Split out NFS v4 server creating code These functions are specific to NFS v4 and can be moved to nfs4client.c to keep them out of the generic client. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
428360d7 |
|
16-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Initialize the NFS v4 client from init_nfs_v4() And split these functions out of the generic client into a v4 specific file. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ce4ef7c0 |
|
16-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Split out NFS v4 file operations This patch moves the NFS v4 file functions into a new file that is only compiled when CONFIG_NFS_V4 is enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
597d9289 |
|
16-Jul-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Split out NFS v2 inode operations This patch moves the NFS v2 file and directory inode functions into files that are only compiled whet CONFIG_NFS_V2 is enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
57208fa7 |
|
20-Jun-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create an write_pageio_init() function pNFS needs to select a write function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing writes. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1abb5088 |
|
20-Jun-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create an read_pageio_init() function pNFS needs to select a read function based on the layout driver currently in use, so I let each NFS version decide how to best handle initializing reads. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6663ee7f |
|
20-Jun-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create an alloc_client rpc_op This gives NFS v4 a way to set up callbacks and sessions without v2 or v3 having to do them as well. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
cdb7eced |
|
20-Jun-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create a free_client rpc_op NFS v4 needs a way to shut down callbacks and sessions, but v2 and v3 don't. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1d59d61f |
|
30-May-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Ensure that setattr and getattr wait for O_DIRECT write completion Use the same mechanism as the block devices are using, but move the helper functions from fs/direct-io.c into fs/inode.c to remove the dependency on CONFIG_BLOCK. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Fred Isaman <iisaman@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
4697bd5e |
|
23-May-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Fix a race in the net namespace mount notification Since the struct nfs_client gets added to the global nfs_client_list before it is initialised, it is possible that rpc_pipefs_event can end up trying to create idmapper entries on such a thing. The solution is to have the mount notification wait for the initialisation of each nfs_client to complete, and then to skip any entries for which the it failed. Reported-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
|
#
7b38c368 |
|
23-May-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4.1: Fix session initialisation races Session initialisation is not complete until the lease manager has run. We need to ensure that both nfs4_init_session and nfs4_init_ds_session do so, and that they check for any resulting errors in clp->cl_cons_state. Only after this is done, can nfs4_ds_connect check the contents of clp->cl_exchange_flags. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com>
|
#
4bf590e0 |
|
21-May-2012 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Add nfs_client behavior flags "noresvport" and "discrtry" can be passed to nfs_create_rpc_client() by setting flags in the passed-in nfs_client. This change makes it easy to add new flags. Note that these settings are now "sticky" over the lifetime of a struct nfs_client, and may even be copied when an nfs_client is cloned. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
8cab4c39 |
|
21-May-2012 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Refactor nfs_get_client(): initialize nfs_client Clean up: Continue to rationalize the locking in nfs_get_client() by moving the logic that handles the case where a matching server IP address is not found. When we support server trunking detection, client initialization may return a different nfs_client struct than was passed to it. Change the synopsis of the init_client methods to return an nfs_client. The client initialization logic in nfs_get_client() is not much more than a wrapper around ->init_client. It's simpler to keep the little bits of error handling in the version-specific init_client methods. No behavior change is expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a033a091 |
|
27-Apr-2012 |
Andy Adamson <andros@netapp.com> |
NFSv4.1 remove nfs4_reset_write and nfs4_reset_read Replaced by filelayout_reset_write and filelayout_reset_read Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
98fc685a |
|
27-Apr-2012 |
Andy Adamson <andros@netapp.com> |
NFSv4.1 data server timeo and retrans module parameters Set the recovery parameters for data servers. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
9f0ec176 |
|
27-Apr-2012 |
Andy Adamson <andros@netapp.com> |
NFSv4.1 set RPC_TASK_SOFTCONN for filelayout DS RPC calls RPC_TASK_SOFTCONN returns connection errors to the caller which allows the pNFS file layout to quickly try the MDS or perhaps another DS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
5e7e5a0d |
|
10-May-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create an NFS v3 stat_to_errno() In theory, NFS v3 can have different error versions than NFS v2. v4 is already using its own nfs4_stat_to_errno() to map error codes, so rather than create something in the generic client for v2 and v3 to share I instead give v3 its own function. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
b72e4f42 |
|
10-May-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create a single function for text mount data The v2/3 and v4 cases were very similar, with just a few parameters changed. This makes it easy to share code. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
3a1556e8 |
|
27-Apr-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv2/v3: Simulate the change attribute Use the ctime to simulate a change attribute for NFSv2 and NFSv3. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
281cad46 |
|
27-Apr-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Create a submount rpc_op This simplifies the code for v2 and v3 and gives v4 a chance to decide on referrals without needing to modify the generic client. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
2671bfc3 |
|
27-Apr-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Remove secinfo knowledge out of the generic client And also remove the unneeded rpc_op. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1763da12 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: rewrite directio write to use async coalesce code This also has the advantage that it allows directio to use pnfs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f453a54a |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: create nfs_commit_completion_ops Factors out the code that needs to change when directio starts using these code paths. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ea2cf228 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: create struct nfs_commit_info It is COMMIT that is handled the most differently between the paged and direct paths. Create a structure that encapsulates everything either path needs to know about the commit state. We could use void to hide some of the layout driver stuff, but Trond suggests pulling it out to ensure type checking, given the huge changes being made, and the fact that it doesn't interfere with other drivers. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
584aa810 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: rewrite directio read to use async coalesce code This also has the advantage that it allows directio to use pnfs. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
061ae2ed |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: create completion structure to pass into page_init functions Factors out the code that will need to change when directio starts using these code paths. This will allow directio to use the generic pagein and flush routines Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6c75dc0d |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: merge _full and _partial write rpc_ops Decouple nfs_pgio_header and nfs_write_data, and have (possibly multiple) nfs_write_datas each take a refcount on nfs_pgio_header. For the moment keeps nfs_write_header as a way to preallocate a single nfs_write_data with the nfs_pgio_header. The code doesn't need this, and would be prettier without, but given the amount of churn I am already introducing I didn't want to play with tuning new mempools. This also fixes bug in pnfs_ld_handle_write_error. In the case of desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing replay attempt to do nothing. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
4db6e0b7 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: merge _full and _partial read rpc_ops Decouple nfs_pgio_header and nfs_read_data, and have (possibly multiple) nfs_read_datas each take a refcount on nfs_pgio_header. For the moment keeps nfs_read_header as a way to preallocate a single nfs_read_data with the nfs_pgio_header. The code doesn't need this, and would be prettier without, but given the amount of churn I am already introducing I didn't want to play with tuning new mempools. This also fixes bug in pnfs_ld_handle_read_error. In the case of desc->pg_bsize < PAGE_CACHE_SIZE, the pages list was empty, causing replay attempt to do nothing. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
30dd374f |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: create struct nfs_page_array Both nfs_read_data and nfs_write_data devote several fields which can be combined into a single shared struct. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
cd841605 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: create common nfs_pgio_header for both read and write In order to avoid duplicating all the data in nfs_read_data whenever we split it up into multiple RPC calls (either due to a short read result or due to rsize < PAGE_SIZE), we split out the bits that are the same per RPC call into a separate "header" structure. The goal this patch moves towards is to have a single header refcounted by several rpc_data structures. Thus, want to always refer from rpc_data to the header, and not the other way. This patch comes close to that ideal, but the directio code currently needs some special casing, isolated in the nfs_direct_[read_write]hdr_release() functions. This will be dealt with in a future patch. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
c5996c4e |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: reverse arg order in nfs_initiate_[read|write] Make it consistent with nfs_initiate_commit. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
0b7c0153 |
|
20-Apr-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: add a struct nfs_commit_data to replace nfs_write_data in commits Commits don't need the vectors of pages, etc. that writes do. Split out a separate structure for the commit operation. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
7e6eb683 |
|
27-Apr-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Honor the authflavor set in the clone mount data The authflavor is set in an nfs_clone_mount structure and passed to the xdev_mount() functions where it was promptly ignored. Instead, use it to initialize an rpc_clnt for the cloned server. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f05d147f |
|
27-Apr-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Fix following referral mount points with different security I create a new proc_lookup_mountpoint() to use when submounting an NFS v4 share. This function returns an rpc_clnt to use for performing an fs_locations() call on a referral's mountpoint. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
72de53ec |
|
27-Apr-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Do secinfo as part of lookup Whenever lookup sees wrongsec do a secinfo and retry the lookup to find attributes of the file or directory, such as "is this a referral mountpoint?". This also allows me to remove handling -NFS4ERR_WRONSEC as part of getattr xdr decoding. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
8dd37758 |
|
15-Mar-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit code Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
|
#
d6d6dc7c |
|
08-Mar-2012 |
Fred Isaman <iisaman@netapp.com> |
NFS: remove nfs_inode radix tree The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
c7add9a9 |
|
26-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: search for client session id in proper network namespace Network namespace is taken from request transport and passed as a part of cb_process_state structure. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
dc030858 |
|
23-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: make nfs_client_lock per net ns This patch makes nfs_clients_lock allocated per network namespace. All items it protects are already network namespace aware. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
28cd1b3f |
|
23-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: make cb_ident_idr per net ns This patch makes ID's infrastructure network namespace aware. This was done mainly because of nfs_client_lock, which is desired to be per network namespace, but protects NFS clients ID's. NOTE: NFS client's net pointer have to be set prior to ID initialization, proper assignment was moved. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6b13168b |
|
23-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: make nfs_client_list per net ns This patch splits global list of NFS clients into per-net-ns array of lists. This looks more strict and clearer. BTW, this patch also makes "/proc/fs/nfsfs/servers" entry content depends on /proc mount owner pid namespace. See below for details. NOTE: few words about how was /proc/fs/nfsfs/ entries content show per network namespace done. This is a little bit tricky and not the best is could be. But it's cheap (proper fix for /proc conteinerization is a hard nut to crack). The idea is simple: take proper network namespace from pid namespace child reaper nsproxy of /proc/ mount creator. This actually means, that if there are 2 containers with different net namespace sharing pid namespace, then read of /proc/fs/nfsfs/ entries will always return content, taken from net namespace of pid namespace creator task (and thus second namespace set wil be unvisible). Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a613fa16 |
|
20-Jan-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
SUNRPC: constify the rpc_program Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
babea479 |
|
20-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: remove unused nfs4_find_client_no_ident function Looks like this function survived after some cleanup patch without a reason. Now it's not called or referenced and I believe, that it can be simply removed. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
eee17325 |
|
10-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: idmap PipeFS notifier introduced v2: 1) Added "nfs_idmap_init" and "nfs_idmap_quit" definitions for kernels built without CONFIG_NFS_V4 option set. This patch subscribes NFS clients to RPC pipefs notifications. Idmap notifier is registering on NFS module load. This notifier callback is responsible for creation/destruction of PipeFS idmap pipe dentry for NFS4 clients. Since ipdmap pipe is created in rpc client pipefs directory, we have make sure, that this directory has been created already. IOW RPC client notifier callback has been called already. To achive this, PipeFS notifier priorities has been introduced (RPC clients notifier priority is greater than NFS idmap one). But this approach gives another problem: unlink for RPC client directory will be called before NFS idmap pipe unlink on UMOUNT event and will fail, because directory is not empty. The solution, introduced in this patch, is to try to remove client directory once again after idmap pipe was unlinked. This looks like ugly hack, so probably it should be replaced in some more elegant way. Note that no locking required in notifier callback because PipeFS superblock pointer is passed as an argument from it's creation or destruction routine and thus we can be sure about it's validity. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6d59b8d5 |
|
10-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: pass NFS client owner network namespace to RPC client creation routine This patch replaces static "init_net" with nfs_client->net pointer in RPC client creation calls. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
e50a7a1a |
|
10-Jan-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFS: make NFS client allocated per network namespace context This patch adds new net variable to nfs_client structure. This variable is set on NFS client creation and cheched during matching NFS client search. Initially current->nsproxy->net_ns is used as network namespace owner for new NFS client to create. This network namespace pointer is set during mount options parsing and thus can be passed from user-spave utils in future if will be necessary. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a6bc32b8 |
|
12-Jan-2012 |
Mel Gorman <mgorman@suse.de> |
mm: compaction: introduce sync-light migration for use by compaction This patch adds a lightweight sync migrate operation MIGRATE_SYNC_LIGHT mode that avoids writing back pages to backing storage. Async compaction maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT. For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is used. This avoids sync compaction stalling for an excessive length of time, particularly when copying files to a USB stick where there might be a large number of dirty pages backed by a filesystem that does not support ->writepages. [aarcange@redhat.com: This patch is heavily based on Andrea's work] [akpm@linux-foundation.org: fix fs/nfs/write.c build] [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
b969c4ab |
|
12-Jan-2012 |
Mel Gorman <mgorman@suse.de> |
mm: compaction: determine if dirty pages can be migrated without blocking within ->migratepage Asynchronous compaction is used when allocating transparent hugepages to avoid blocking for long periods of time. Due to reports of stalling, there was a debate on disabling synchronous compaction but this severely impacted allocation success rates. Part of the reason was that many dirty pages are skipped in asynchronous compaction by the following check; if (PageDirty(page) && !sync && mapping->a_ops->migratepage != migrate_page) rc = -EBUSY; This skips over all mapping aops using buffer_migrate_page() even though it is possible to migrate some of these pages without blocking. This patch updates the ->migratepage callback with a "sync" parameter. It is the responsibility of the callback to fail gracefully if migration would block. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Dave Jones <davej@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Andy Isaacson <adi@hexapodia.org> Cc: Nai Xia <nai.xia@gmail.com> Cc: Johannes Weiner <jweiner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
e2fecb21 |
|
06-Jan-2012 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Remove pNFS bloat from the generic write path We have no business doing any this in the standard write release path. Get rid of it, and put it in the pNFS layer. Also, while we're at it, get rid of the completely bogus unlock/relock semantics that were present in nfs_writeback_release_full(). It is not only unnecessary, but actually dangerous to release the write lock just in order to take it again in nfs_page_async_flush(). Better just to open code the pgio operations in a pnfs helper. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
62e4a769 |
|
10-Nov-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Revert pnfs ugliness from the generic NFS read code path pNFS-specific code belongs in the pnfs layer. It should not be hijacking generic NFS read or write code paths. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d00c5d43 |
|
19-Oct-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Get rid of nfs_restart_rpc() It can trivially be replaced with rpc_restart_call_prepare. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1f945357 |
|
13-Jul-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Clean up - simplify the switch to read/write-through-MDS Use nfs_pageio_reset_read_mds and nfs_pageio_reset_write_mds instead of completely reinitialising the struct nfs_pageio_descriptor. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
dce81290 |
|
13-Jul-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Move the pnfs write code into pnfs.c ...and ensure that we recoalese to take into account differences in differences in block sizes when falling back to write through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
493292dd |
|
13-Jul-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Move the pnfs read code into pnfs.c ...and ensure that we recoalese to take into account differences in block sizes when falling back to read through the MDS. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
e885de1a |
|
10-Jun-2011 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4.1: Fall back to ordinary i/o through the mds if we have no layout segment Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
fca78d6d |
|
02-Jun-2011 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Add SECINFO_NO_NAME procedure If the client is using NFS v4.1, then we can use SECINFO_NO_NAME to find the secflavor for the initial mount. If the server doesn't support SECINFO_NO_NAME then I fall back on the "guess and check" method used for v4.0 mounts. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
533eb461 |
|
13-Jun-2011 |
Andy Adamson <andros@netapp.com> |
NFSv4.1: allow nfs_fhget to succeed with mounted on fileid Commit 28331a46d88459788c8fca72dbb0415cd7f514c9 "Ensure we request the ordinary fileid when doing readdirplus" changed the meaning of NFS_ATTR_FATTR_FILEID which used to be set when FATTR4_WORD1_MOUNTED_ON_FILED was requested. Allow nfs_fhget to succeed with only a mounted on fileid when crossing a mountpoint or a referral. Ask for the fileid of the absent file system if mounted_on_fileid is not supported. Signed-off-by: Andy Adamson <andros@netapp.com> cc:stable@kernel.org [2.6.39] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d20581aa |
|
22-May-2011 |
Benny Halevy <bhalevy@panasas.com> |
pnfs: support for non-rpc layout drivers Non-rpc layout driver such as for objects and blocks implement their own I/O path and error handling logic. Therefore bypass NFS-based error handling for these layout drivers. [fix lseg ref-count bugs, and null de-refs] [Fall out from: non-rpc layout drivers] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [get rid of PNFS_USE_RPC_CODE] [get rid of __nfs4_write_done_cb] [revert useless change in nfs4_write_done_cb] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
|
#
1495f230 |
|
24-May-2011 |
Ying Han <yinghan@google.com> |
vmscan: change shrinker API by passing shrink_control struct Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: fix warning] [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API] [akpm@linux-foundation.org: fix xfs warning] [akpm@linux-foundation.org: update gfs2] Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
#
7ebb9315 |
|
24-Mar-2011 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: use secinfo when crossing mountpoints A submount may use different security than the parent mount does. We should figure out what sec flavor the submount uses at mount time. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
7c513058 |
|
24-Mar-2011 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: lookup supports alternate client A later patch will need to perform a lookup using an alternate client with a different security flavor. This patch adds support for doing that on NFS v4. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
e0c2b380 |
|
23-Mar-2011 |
Fred Isaman <iisaman@netapp.com> |
NFSv4.1: filelayout driver specific code for COMMIT Implement all the hooks created in the previous patches. This requires exporting quite a few functions and adding a few structure fields. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f8ad9c4b |
|
16-Mar-2011 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: nfs_do_{ref,sub}mount() superblock argument is redundant It's always equal to dentry->d_sb Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
b514f872 |
|
16-Mar-2011 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: make nfs_path() work without vfsmount part 3: now we have everything to get nfs_path() just by dentry - just follow to (disconnected) root and pick the rest of the thing there. Start killing propagation of struct vfsmount * on the paths that used to bring it to nfs_path(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
0d5839ad |
|
16-Mar-2011 |
Al Viro <viro@zeniv.linux.org.uk> |
nfs: propagate devname to nfs{,4}_get_root() step 1 of ->mnt_devname fixes: make sure we have the value of devname available in ..._get_root(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
a69aef14 |
|
03-Mar-2011 |
Fred Isaman <iisaman@netapp.com> |
NFSv4.1: pnfs filelayout driver write Allows the pnfs filelayout driver to write to the data servers. Note that COMMIT to data servers will be implemented in a future patch. To avoid improper behavior, for the moment any WRITE to a data server that would also require a COMMIT to the data server is sent NFS_FILE_SYNC. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
cbdabc7f |
|
28-Feb-2011 |
Andy Adamson <andros@netapp.com> |
NFSv4.1: filelayout async error handler Use our own async error handler. Mark the layout as failed and retry i/o through the MDS on specified errors. Update the mds_offset in nfs_readpage_retry so that a failed short-read retry to a DS gets correctly resent through the MDS. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
dc70d7b3 |
|
28-Feb-2011 |
Andy Adamson <andros@netapp.com> |
NFSv4.1: filelayout read Attempt a pNFS file layout read by setting up the nfs_read_data struct and calling nfs_initiate_read with the data server rpc client and the filelayout rpc call ops. Error handling is implemented in a subsequent patch. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Ricardo Labiaga <Ricardo.Labiaga@netapp.com> Tested-by: Guo Mingyang <guomingyang@nrchpc.ac.cn> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d83217c1 |
|
28-Feb-2011 |
Andy Adamson <andros@netapp.com> |
NFSv4.1: data server connection Introduce a data server set_client and init session following the nfs4_set_client and nfs4_init_session convention. Once a new nfs_client is on the nfs_client_list, the nfs_client cl_cons_state serializes access to creating an nfs_client struct with matching properties. Use the new nfs_get_client() that initializes new clients. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
45a52a02 |
|
28-Feb-2011 |
Andy Adamson <andros@netapp.com> |
NFS move nfs_client initialization into nfs_get_client Now nfs_get_client returns an nfs_client ready to be used no matter if it was found or created. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
778be232 |
|
25-Jan-2011 |
Andy Adamson <andros@netapp.com> |
NFS do not find client in NFSv4 pg_authenticate The information required to find the nfs_client cooresponding to the incoming back channel request is contained in the NFS layer. Perform minimal checking in the RPC layer pg_authenticate method, and push more detailed checking into the NFS layer where the nfs_client can be found. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
36d43a43 |
|
14-Jan-2011 |
David Howells <dhowells@redhat.com> |
NFS: Use d_automount() rather than abusing follow_link() Make NFS use the new d_automount() dentry operation rather than abusing follow_link() on directories. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
c36fca52 |
|
05-Jan-2011 |
Andy Adamson <andros@netapp.com> |
NFS refactor nfs_find_client and reference client across callback processing Fixes a bug where the nfs_client could be freed during callback processing. Refactor nfs_find_client to use minorversion specific means to locate the correct nfs_client structure. In the NFS layer, V4.0 clients are found using the callback_ident field in the CB_COMPOUND header. V4.1 clients are found using the sessionID in the CB_SEQUENCE operation which is also compared against the sessionID associated with the back channel thread after a successful CREATE_SESSION. Each of these methods finds the one an only nfs_client associated with the incoming callback request - so nfs_find_client_next is not needed. In the RPC layer, the pg_authenticate call needs to find the nfs_client. For the v4.0 callback service, the callback identifier has not been decoded so a search by address, version, and minorversion is used. The sessionid for the sessions based callback service has (usually) not been set for the pg_authenticate on a CB_NULL call which can be sent prior to the return of a CREATE_SESSION call, so the sessionid associated with the back channel thread is not used to find the client in pg_authenticate for CB_NULL calls. Pass the referenced nfs_client to each CB_COMPOUND operation being proceesed via the new cb_process_state structure. The reference is held across cb_compound processing. Use the new cb_process_state struct to move the NFS4ERR_RETRY_UNCACHED_REP processing from process_op into nfs4_callback_sequence where it belongs. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f4eecd5d |
|
05-Jan-2011 |
Andy Adamson <andros@netapp.com> |
NFS implement v4.0 callback_ident Use the small id to pointer translator service to provide a unique callback identifier per SETCLIENTID call used to identify the v4.0 callback service associated with the clientid. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
573c4e1e |
|
14-Dec-2010 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Simplify ->decode_dirent() calling sequence Clean up. The pointer returned by ->decode_dirent() is no longer used as a pointer. The only call site (xdr_decode() in fs/nfs/dir.c) simply extracts the errno value encoded in the pointer. Replace the returned pointer with a standard integer errno return value. Also, pass the "server" argument as part of the nfs_entry instead of as a separate parameter. It's faster to derive "server" in nfs_readdir_xdr_to_array() since we already have the directory's inode handy. "server" ought to be invariant for a set of entries in the same directory, right? The legacy versions of decode_dirent() don't use "server" anyway, so it's wasted work for them to derive and pass "server" for each entry. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f796f8b3 |
|
14-Dec-2010 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Introduce new-style XDR decoding functions for NFSv2 We'd like to prevent local buffer overflows caused by malicious or broken servers. New xdr_stream style decoders can do that. For efficiency, we also eventually want to be able to pass xdr_streams from call_decode() to all XDR decoding functions, rather than building an xdr_stream in every XDR decoding function in the kernel. nfs_decode_dirent() is renamed to follow the naming convention of the other two dirent decoders. Static helper functions are left without the "inline" directive. This allows the compiler to choose automatically how to optimize these for size or speed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
85828493 |
|
14-Dec-2010 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Use the "nfs_stat" enum for nfs_stat_to_errno()'s argument Clean up. To distinguish more clearly between the on-the-wire NFSERR_ value and our local errno values, use the proper type for the argument of nfs_stat_to_errno(). Add a documenting comment appropriate for a global function shared outside this source file. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
0b26a0bf |
|
20-Nov-2010 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Ensure we return the dirent->d_type when it is known Store the dirent->d_type in the struct nfs_cache_array_entry so that we can use it in getdents() calls. This fixes a regression with the new readdir code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
82f2e547 |
|
21-Oct-2010 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: Readdir plus in v4 By requsting more attributes during a readdir, we can mimic the readdir plus operation that was in NFSv3. To test, I ran the command `ls -lU --color=none` on directories with various numbers of files. Without readdir plus, I see this: n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 --------+-----------+-----------+-----------+-----------+---------- real | 0m00.153s | 0m00.589s | 0m05.601s | 0m56.691s | 9m59.128s user | 0m00.007s | 0m00.007s | 0m00.077s | 0m00.703s | 0m06.800s sys | 0m00.010s | 0m00.070s | 0m00.633s | 0m06.423s | 1m10.005s access | 3 | 1 | 1 | 4 | 31 getattr | 2 | 1 | 1 | 1 | 1 lookup | 104 | 1,003 | 10,003 | 100,003 | 1,000,003 readdir | 2 | 16 | 158 | 1,575 | 15,749 total | 111 | 1,021 | 10,163 | 101,583 | 1,015,784 With readdir plus enabled, I see this: n files | 100 | 1,000 | 10,000 | 100,000 | 1,000,000 --------+-----------+-----------+-----------+-----------+---------- real | 0m00.115s | 0m00.206s | 0m01.079s | 0m12.521s | 2m07.528s user | 0m00.003s | 0m00.003s | 0m00.040s | 0m00.290s | 0m03.296s sys | 0m00.007s | 0m00.020s | 0m00.120s | 0m01.357s | 0m17.556s access | 3 | 1 | 1 | 1 | 7 getattr | 2 | 1 | 1 | 1 | 1 lookup | 4 | 3 | 3 | 3 | 3 readdir | 6 | 62 | 630 | 6,300 | 62,993 total | 15 | 67 | 635 | 6,305 | 63,004 Readdir plus disabled has about a 16x increase in the number of rpc calls and is 4 - 5 times slower on large directories. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
56e4ebf8 |
|
20-Oct-2010 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: readdir with vmapped pages We can use vmapped pages to read more information from the network at once. This will reduce the number of calls needed to complete a readdir. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> [trondmy: Added #include for linux/vmalloc.h> in fs/nfs/dir.c] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
babddc72 |
|
20-Oct-2010 |
Bryan Schumaker <bjschuma@netapp.com> |
NFS: decode_dirent should use an xdr_stream Convert nfs*xdr.c to use an xdr stream in decode_dirent. This will prevent a kernel oops that has been occuring when reading a vmapped page. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
b57922d9 |
|
07-Jun-2010 |
Al Viro <viro@zeniv.linux.org.uk> |
convert remaining ->clear_inode() to ->evict_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
d05dd4e9 |
|
31-Jul-2010 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Fix the NFS users of rpc_restart_call() Fix up those functions that depend on knowing whether or not rpc_restart_call is successful or not. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
7f8275d0 |
|
18-Jul-2010 |
Dave Chinner <dchinner@redhat.com> |
mm: add context argument to shrinker callback The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
815409d2 |
|
16-Apr-2010 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Eliminate nfs4_path_walk() All we really want is the ability to retrieve the root file handle. We no longer need the ability to walk down the path, since that is now done in nfs_follow_remote_path(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a9185b41 |
|
05-Mar-2010 |
Christoph Hellwig <hch@lst.de> |
pass writeback_control to ->write_inode This gives the filesystem more information about the writeback that is happening. Trond requested this for the NFS unstable write handling, and other filesystems might benefit from this too by beeing able to distinguish between the different callers in more detail. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
0110ee15 |
|
07-Dec-2009 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Fix up the declaration of nfs4_restart_rpc when NFSv4 not configured Also rename it: it is used in generic code, and so should not have a 'nfs4' prefix. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d61e612a |
|
05-Dec-2009 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv41: Clean up slot table management We no longer need to maintain a distinction between nfs41_sequence_done and nfs41_sequence_free_slot. This fixes a number of slot table leakages in the NFSv4.1 code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
2449ea2e |
|
05-Dec-2009 |
Alexandros Batsakis <batsakis@netapp.com> |
nfs41: V2 adjust max_rqst_sz, max_resp_sz w.r.t to rsize, wsize The v4.1 client should take into account the desired rsize, wsize when negotiating the max size in CREATE_SESSION. Accordingly, it should use rsize, wsize that are smaller than the session negotiated values. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
d8cb1a7c |
|
05-Dec-2009 |
Alexandros Batsakis <batsakis@netapp.com> |
nfs41: check if session exists and if it is persistent Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
07bccc2d |
|
05-Dec-2009 |
Alexandros Batsakis <batsakis@netapp.com> |
nfs41: add support for callback with RPC version number 4 The NFSv4.1 spec-29 (18.36.3) says that the server MUST use an ONC RPC (program) version number equal to 4 in callbacks sent to the client. For now we allow both versions 1 and 4. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
e608e79f |
|
04-Dec-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: call free slot from nfs4_restart_rpc nfs41_sequence_free_slot can be called multiple times on SEQUENCE operation errors. No reason to inline nfs4_restart_rpc Reported-by: Trond Myklebust <trond.myklebust@netapp.com> nfs_writeback_done and nfs_readpage_retry call nfs4_restart_rpc outside the error handler, and the slot is not freed prior to restarting in the rpc_prepare state during session reset. Fix this by moving the call to nfs41_sequence_free_slot from the error path of nfs41_sequence_done into nfs4_restart_rpc, and by removing the test for NFS4CLNT_SESSION_SETUP. Always free slot and goto the rpc prepare state on async errors. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6df08189 |
|
04-Dec-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: rename cl_state session SETUP bit to RESET The bit is no longer used for session setup, only for session reset. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
764302cc |
|
08-Sep-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Allow the "nfs" file system type to support NFSv4 When mounting an "nfs" type file system, recognize "v4," "vers=4," or "nfsvers=4" mount options, and convert the file system to "nfs4" under the covers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [trondmy: fixed up binary mount code so it sets the 'version' field too] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
4cfd74fc |
|
08-Sep-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Mount option parser should detect missing "port=" The meaning of not specifying the "port=" mount option is different for "-t nfs" and "-t nfs4" mounts. The default port value for NFSv2/v3 mounts is 0, but the default for NFSv4 mounts is 2049. To support "-t nfs -o vers=4", the mount option parser must detect when "port=" is missing so that the correct default port value can be set depending on which NFS version is requested. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
074cc1de |
|
10-Aug-2009 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Add a ->migratepage() aop for NFS Make NFS a bit more friendly to NUMA and memory hot removal... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ec6ee612 |
|
09-Aug-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Replace nfs_set_port() with rpc_set_port() Clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
53a0b9c4 |
|
09-Aug-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Replace nfs_parse_ip_address() with rpc_pton() Clean up: Use the common routine now provided in sunrpc.ko for parsing mount addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a02d6926 |
|
09-Aug-2009 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Provide functions for managing universal addresses Introduce a set of functions in the kernel's RPC implementation for converting between a socket address and either a standard presentation address string or an RPC universal address. The universal address functions will be used to encode and decode RPCB_FOO and NFSv4 SETCLIENTID arguments. The other functions are part of a previous promise to deliver shared functions that can be used by upper-layer protocols to display and manipulate IP addresses. The kernel's current address printf formatters were designed specifically for kernel to user-space APIs that require a particular string format for socket addresses, thus are somewhat limited for the purposes of sunrpc.ko. The formatter for IPv6 addresses, %pI6, does not support short-handing or scope IDs. Also, these printf formatters are unique per address family, so a separate formatter string is required for printing AF_INET and AF_INET6 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
0b524123 |
|
09-Aug-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Add ability to send MOUNTPROC_UMNT to the kernel's mountd client After certain failure modes of an NFS mount, an NFS client should send a MOUNTPROC_UMNT request to remove the just-added mount entry from the server's mount table. While no-one should rely on the accuracy of the server's mount table, sending a UMNT is simply being a good internet neighbor. Since NFS mount processing is handled in the kernel now, we will need a function in the kernel's mountd client that can post a MOUNTRPC_UMNT request, in order to handle these failure modes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
8e02f6b9a |
|
17-Jun-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Update MNT and MNT3 reply decoding functions Solder xdr_stream-based XDR decoding functions into the in-kernel mountd client that are more careful about checking data types and watching for buffer overflows. The new MNT3 decoder includes support for auth-flavor list decoding. The "_sz" macro for MNT3 replies was missing the size of the file handle. I've added this back, and included the size of the auth flavor array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
a14017db |
|
17-Jun-2009 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: add XDR decoder for mountd version 3 auth-flavor lists Introduce an xdr_stream-based XDR decoder that can unpack the auth- flavor list returned in a MNT3 reply. The nfs_mount() function's caller allocates an array, and passes the size and a pointer to it. The decoder decodes all the flavors it can into the array, and returns the number of decoded flavors. If the caller is not interested in the auth flavors, it can pass a value of zero as the size of the pre-allocated array. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
008f55d0 |
|
01-Apr-2009 |
Benny Halevy <bhalevy@panasas.com> |
nfs41: recover lease in _nfs4_lookup_root This creates the nfsv4.1 session on mount. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
eedc020e |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: use rpc prepare call state for session reset [nfs41: change nfs4_restart_rpc argument] [nfs41: check for session not minorversion] [nfs41: trigger the state manager for session reset] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [always define nfs4_restart_rpc] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
76db6d95 |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: add session setup to the state manager At mount, nfs_alloc_client sets the cl_state NFS4CLNT_LEASE_EXPIRED bit and nfs4_alloc_session sets the NFS4CLNT_SESSION_SETUP bit, so both bits are set when nfs4_lookup_root calls nfs4_recover_expired_lease which schedules the nfs4_state_manager and waits for it to complete. Place the session setup after the clientid establishment in nfs4_state_manager so that the session is setup right after the clientid has been established without rescheduling the state manager. Unlike nfsv4.0, the nfs_client struct is not ready to use until the session has been established. Postpone marking the nfs_client struct to NFS_CS_READY until after a successful CREATE_SESSION call so that other threads cannot use the client until the session is established. If the EXCHANGE_ID call fails and the session has not been setup (the NFS4CLNT_SESSION_SETUP bit is set), mark the client with the error and return. If the session setup CREATE_SESSION call fails with NFS4ERR_STALE_CLIENTID which could occur due to server reboot or network partition inbetween the EXCHANGE_ID and CREATE_SESSION call, reset the NFS4CLNT_LEASE_EXPIRED and NFS4CLNT_SESSION_SETUP bits and try again. If the CREATE_SESSION call fails with other errors, mark the client with the error and return. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: NFS_CS_SESSION_SETUP cl_cons_state for back channel setup] On session setup, the CREATE_SESSION reply races with the server back channel probe which needs to succeed to setup the back channel. Set a new cl_cons_state NFS_CS_SESSION_SETUP just prior to the CREATE_SESSION call and add it as a valid state to nfs_find_client so that the client back channel can find the nfs_client struct and won't drop the server backchannel probe. Use a new cl_cons_state so that NFSv4.0 back channel behaviour which only sets NFS_CS_READY is unchanged. Adjust waiting on the nfs_client_active_wq accordingly. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rename NFS_CS_SESSION_SETUP to NFS_CS_SESSION_INITING] Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: set NFS_CL_SESSION_INITING in alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: move session setup into a function] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_proc_create_session declaration here] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
def6ed7e |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41 write sequence setup done support Separate write calls from nfs41: sequence setup/done support Implement the write rpc_call_prepare method for asynchronuos nfs rpcs, call nfs41_setup_sequence from respective rpc_call_validate_args methods. Call nfs4_sequence_done from respective rpc_call_done methods. Note that we need to pass a pointer to the nfs_server in calls data for passing on to nfs4_sequence_done. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfs: client data server write validate and release] Signed-off-by: Andy Adamson <andros@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [move the nfs4_sequence_free_slot call in nfs_readpage_retry from] [nfs41: separate free slot from sequence done Signed-off-by: Andy Adamson <andros@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Support sessions with O_DIRECT.] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_sequence_free_slot use nfs_client for data server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f11c88af |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: read sequence setup/done support Implement the read rpc_call_prepare method for asynchronuos nfs rpcs, call nfs41_setup_sequence from respective rpc_call_validate_args methods. Call nfs4_sequence_done from respective rpc_call_done methods. Note that we need to pass a pointer to the nfs_server in calls data for passing on to nfs4_sequence_done. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfs: client data server write validate and release] Signed-off-by: Andy Adamson <andros@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [move the nfs4_sequence_free_slot call in nfs_readpage_retry from] [nfs41: separate free slot from sequence done] [remove nfs_readargs.nfs_server, use calldata->inode instead] Signed-off-by: Andy Adamson <andros@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Support sessions with O_DIRECT] Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_sequence_free_slot use nfs_client for data server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
13615871 |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: nfs41_sequence_free_slot [from nfs41: separate free slot from sequence done] Don't free the slot until after all rpc_restart_calls have completed. Session reset will require more work. As noted by Trond, since we're using rpc_wake_up_next rather than rpc_wake_up() we must always wake up the next task in the queue either by going through nfs4_free_slot, or just calling rpc_wake_up_next if no slot is to be freed. [nfs41: sequence res use slotid] [nfs41: remove SEQ4_STATUS_USE_TK_STATUS] [got rid of nfs4_sequence_res.sr_session, use nfs_client.cl_session instead] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: rpc_wake_up_next if sessions slot was not consumed.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_sequence_free_slot use nfs_client for data server] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
cccef3b9 |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: introduce nfs4_call_sync Use nfs4_call_sync rather than rpc_call_sync to provide for a nfs41 sessions-enabled interface for sessions manipulation. The nfs41 rpc logic uses the rpc_call_prepare method to recover and create the session, as well as selecting a free slot id and the rpc_call_done to free the slot and update slot table related metadata. In the coming patches we'll add rpc prepare and done routines for setting up the sequence op and processing the sequence result. Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: nfs4_call_sync] As per 11-14-08 review. Squash into "nfs41: introduce nfs4_call_sync" and "nfs41: nfs4_setup_sequence" Define two functions one for v4 and one for v41 add a pointer to struct nfs4_client to the correct one. Signed-off-by: Andy Adamson <andros@netapp.com> [added BUG() in _nfs4_call_sync_session if !CONFIG_NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: check for session not minorversion] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [group minorversion specific stuff together] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Andy Adamson <andros@netapp.com> [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_init_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [cleaned-up patch: got rid of nfs_call_sync_t, dprintks, cosmetics, extra server defs] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
557134a3 |
|
01-Apr-2009 |
Andy Adamson <andros@netapp.com> |
nfs41: sessions client infrastructure NFSv4.1 Sessions basic data types, initialization, and destruction. The session is always associated with a struct nfs_client that holds the exchange_id results. Signed-off-by: Rahul Iyer <iyer@netapp.com> Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [remove extraneous rpc_clnt pointer, use the struct nfs_client cl_rpcclient. remove the rpc_clnt parameter from nfs4 nfs4_init_session] Signed-off-by: Andy Adamson<andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Use the presence of a session to determine behaviour instead of the minorversion number.] Signed-off-by: Andy Adamson <andros@netapp.com> [constified nfs4_has_session's struct nfs_client parameter] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Rename nfs4_put_session() to nfs4_destroy_session() and call it from nfs4_free_client() not nfs4_free_server(). Also get rid of nfs4_get_session() and the ref_count in nfs4_session struct as keeping track of nfs_client should be sufficient] Signed-off-by: Alexandros Batsakis <Alexandros.Batsakis@netapp.com> [nfs41: pass rsize and wsize into nfs4_init_session] Signed-off-by: Andy Adamson <andros@netapp.com> [separated out removal of rpc_clnt parameter from nfs4_init_session ot a patch of its own] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Pass the nfs_client pointer into nfs4_alloc_session] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: don't assign to session->clp->cl_session in nfs4_destroy_session] [nfs41: fixup nfs4_clear_client_minor_version] [introduce nfs4_clear_client_minor_version() in this patch] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Refactor nfs4_init_session] Moved session allocation into nfs4_init_client_minor_version, called from nfs4_init_client. Leave rwise and wsize initialization in nfs4_init_session, called from nfs4_init_server. Reverted moving of nfs_fsid definition to nfs_fs_sb.h Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Move NFS4_MAX_SLOT_TABLE define from under CONFIG_NFS_V4_1] [Fix comile error when CONFIG_NFS_V4_1 is not set.] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [moved nfs4_init_slot_table definition to "create_session operation"] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: alloc session with GFP_KERNEL] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
3fd5be9e |
|
01-Apr-2009 |
Mike Sager <sager@netapp.com> |
nfs41: add mount command option minorversion mount -t nfs4 -o minorversion=[0|1] specifies whether to use 4.0 or 4.1. By default, the minorversion is set to 0. Signed-off-by: Mike Sager <sager@netapp.com> [set default minorversion to 0 as per Trond and SteveD's request] Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
b797cac7 |
|
03-Apr-2009 |
David Howells <dhowells@redhat.com> |
NFS: Add mount options to enable local caching on NFS Add NFS mount options to allow the local caching support to be enabled. The attached patch makes it possible for the NFS filesystem to be told to make use of the network filesystem local caching service (FS-Cache). To be able to use this, a recent nfsutils package is required. There are three variant NFS mount options that can be added to a mount command to control caching for a mount. Only the last one specified takes effect: (*) Adding "fsc" will request caching. (*) Adding "fsc=<string>" will request caching and also specify a uniquifier. (*) Adding "nofsc" will disable caching. For example: mount warthog:/ /a -o fsc The cache of a particular superblock (NFS FSID) will be shared between all mounts of that volume, provided they have the same connection parameters and are not marked 'nosharecache'. Where it is otherwise impossible to distinguish superblocks because all the parameters are identical, but the 'nosharecache' option is supplied, a uniquifying string must be supplied, else only the first mount will be permitted to use the cache. If there's a key collision, then the second mount will disable caching and give a warning into the kernel log. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
#
08734048 |
|
03-Apr-2009 |
David Howells <dhowells@redhat.com> |
NFS: Define and create superblock-level objects Define and create superblock-level cache index objects (as managed by nfs_server structs). Each superblock object is created in a server level index object and is itself an index into which inode-level objects are inserted. Ideally there would be one superblock-level object per server, and the former would be folded into the latter; however, since the "nosharecache" option exists this isn't possible. The superblock object key is a sequence consisting of: (1) Certain superblock s_flags. (2) Various connection parameters that serve to distinguish superblocks for sget(). (3) The volume FSID. (4) The security flavour. (5) The uniquifier length. (6) The uniquifier text. This is normally an empty string, unless the fsc=xyz mount option was used to explicitly specify a uniquifier. The key blob is of variable length, depending on the length of (6). The superblock object is given no coherency data to carry in the auxiliary data permitted by the cache. It is assumed that the superblock is always coherent. This patch also adds uniquification handling such that two otherwise identical superblocks, at least one of which is marked "nosharecache", won't end up trying to share the on-disk cache. It will be possible to manually provide a uniquifier through a mount option with a later patch to avoid the error otherwise produced. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Steve Dickson <steved@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Tested-by: Daire Byrne <Daire.Byrne@framestore.com>
|
#
7fe5c398 |
|
19-Mar-2009 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Optimise NFS close() Close-to-open cache consistency rules really only require us to flush out writes on calls to close(), and require us to revalidate attributes on the very last close of the file. Currently we appear to be doing a lot of extra attribute revalidation and cache flushes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
72cb77f4 |
|
11-Mar-2009 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Throttle page dirtying while we're flushing to disk The following patch is a combination of a patch by myself and Peter Staubach. Trond: If we allow other processes to dirty pages while a process is doing a consistency sync to disk, we can end up never making progress. Peter: Attached is a patch which addresses a continuing problem with the NFS client generating out of order WRITE requests. While this is compliant with all of the current protocol specifications, there are servers in the market which can not handle out of order WRITE requests very well. Also, this may lead to sub-optimal block allocations in the underlying file system on the server. This may cause the read throughputs to be reduced when reading the file from the server. Peter: There has been a lot of work recently done to address out of order issues on a systemic level. However, the NFS client is still susceptible to the problem. Out of order WRITE requests can occur when pdflush is in the middle of writing out pages while the process dirtying the pages calls generic_file_buffered_write which calls generic_perform_write which calls balance_dirty_pages_rate_limited which ends up calling writeback_inodes which ends up calling back into the NFS client to writes out dirty pages for the same file that pdflush happens to be working with. Signed-off-by: Peter Staubach <staubach@redhat.com> [modification by Trond to merge the two similar patches] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
50a737f8 |
|
23-Dec-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: "[no]resvport" mount option changes mountd client too If the admin has specified the "noresvport" option for an NFS mount point, the kernel's NFS client uses an unprivileged source port for the main NFS transport. The kernel's mountd client should use an unprivileged port in this case as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
c5d120f8 |
|
23-Dec-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: introduce nfs_mount_info struct for calling nfs_mount() Clean up: convert nfs_mount() to take a single data structure argument to make it simpler to add more arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
146ec944 |
|
23-Dec-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Move declaration of nfs_mount() to fs/nfs/internal.h Clean up: The nfs_mount() function is not to be used outside of the NFS client. Move its public declaration to fs/nfs/internal.h. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
456018d7 |
|
08-Oct-2008 |
J. Bruce Fields <bfields@citi.umich.edu> |
NFS: Cleanup nfs_set_port Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ea31a443 |
|
20-Aug-2008 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfs: Fix misparsing of nfsv4 fs_locations attribute The code incorrectly assumes here that the server name (or ip address) is null-terminated. This can cause referrals to fail in some cases. Also support ipv6 addresses. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f0c92925 |
|
20-Aug-2008 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfs: prepare to share nfs_set_port We plan to use this function elsewhere. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
1daef0a8 |
|
27-Jul-2008 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Clean up nfs_sb_active/nfs_sb_deactive Instead of causing umount requests to block on server->active_wq while the asynchronous sillyrename deletes are executing, we can use the sb->s_active counter to obtain a reference to the super_block, and then release that reference in nfs_async_unlink_release(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f41f7418 |
|
11-Jun-2008 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Ensure we zap only the access and acl caches when setting new acls ...and ensure that we obey the NFS_INO_INVALID_ACL flag when retrieving the acls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ecfc555a |
|
14-Mar-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Always enable NFS direct I/O Since O_DIRECT is a standard feature that is enabled in most distros, eliminate the CONFIG_NFS_DIRECTIO build option, and change the fs/nfs/Makefile to always build in the NFS direct I/O engine. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f22d6d79 |
|
14-Mar-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Save the value of the "port=" mount option During a remount based on the mount options displayed in /proc/mounts, we want to preserve the original behavior of the mount request. Let's save the original setting of the "port=" mount option in the mount's nfs_server structure. This allows us to simplify the default behavior of port setting for NFSv4 mounts: by default, NFSv2/3 mounts first try an RPC bind to determine the NFS server's port, unless the user specified the "port=" mount option; Users can force the client to skip the RPC bind by explicitly specifying "port=<value>". NFSv4, by contrast, assumes the NFS server port is 2049 and skips the RPC bind, unless the user specifies "port=". Users can force an RPC bind for NFSv4 by explicitly specifying "port=0". I added a couple of extra comments to clarify this behavior. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
78fa701f |
|
14-Mar-2008 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Fix up data types of fields in nfs_parsed_mount_options Clean up: make data types of fields in nfs_parsed_mount_options more consistent with other uses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f9c3a380 |
|
05-Mar-2008 |
Eric Paris <eparis@redhat.com> |
NFS: use new LSM interfaces to explicitly set mount options NFS and SELinux worked together previously because SELinux had NFS specific knowledge built in. This design was approved by both groups back in 2004 but the recent NFS changes to use nfs_parsed_mount_data and the usage of nfs_clone_mount_data showed this to be a poor fragile solution. This patch fixes the NFS functionality regression by making use of the new LSM interfaces to allow an FS to explicitly set its own mount options. The explicit setting of mount options is done in the nfs get_sb functions which are called before the generic vfs hooks try to set mount options for filesystems which use text mount data. This does not currently support NFSv4 as that functionality did not exist in previous kernels and thus there is no regression. I will be adding the needed code, which I believe to be the exact same as the v3 code, in nfs4_get_sb for 2.6.26. Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: James Morris <jmorris@namei.org>
|
#
5746006f |
|
19-Feb-2008 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Add an nfsiod workqueue NFS post-rpciod cleanups often involve tasks that cannot be safely performed within the rpciod context (due to deadlock concerns). We therefore add a dedicated NFS workqueue that can perform tasks like cleaning up state after an interrupted NFSv4 open() call, or calling put_nfs_open_context() after an asynchronous read or write call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
3fbd67ad |
|
25-Jan-2008 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Iterate through all nfs_clients when the server recalls a delegation The same delegation may have been handed out to more than one nfs_client. Ensure that if a recall occurs, we return all instances. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
4c568017 |
|
10-Dec-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Support non-IPv4 addresses in nfs_parsed_mount_data Replace the nfs_server and mount_server address fields in the nfs_parsed_mount_data structure with a "struct sockaddr_storage" instead of a "struct sockaddr_in". Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6677d095 |
|
10-Dec-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Adjust nfs_clone_mount structure to store "struct sockaddr *" Change the addr field in the nfs_clone_mount structure to store a "struct sockaddr *" to support non-IPv4 addresses in the NFS client. Note this is mostly a cosmetic change, and does not actually allow referrals using IPv6 addresses. The existing referral code assumes that the server returns a string that represents an IPv4 address. This code needs to support hostnames and IPv6 addresses as well as IPv4 addresses, thus it will need to be reorganized completely (to handle DNS resolution in user space). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ff052645 |
|
10-Dec-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Change nfs_find_client() to take "struct sockaddr *" Adjust arguments and callers of nfs_find_client() to pass a "struct sockaddr *" instead of "struct sockaddr_in *" to support non-IPv4 addresses. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net> Trond: Also fix up protocol version number argument in nfs_find_client() to use the correct u32 type. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
e887cbcf |
|
26-Oct-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Remove support for the 'mountprog' option Remove the mount option that allows users to specify an alternate mountd program number. The client hasn't support setting an alternate mountd program number for a very long time. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ad879cef |
|
26-Oct-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Remove support for the 'nfsprog' option Remove the mount option that allows users to specify an alternate NFS program number. The client hasn't support setting an alternate NFS program number for a very long time. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
ef818a28 |
|
08-Nov-2007 |
Steve Dickson <SteveD@redhat.com> |
NFS: Stop sillyname renames and unmounts from racing Added an active/deactive mechanism to the nfs_server structure allowing async operations to hold off umount until the operations are done. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
91ea40b9 |
|
10-Sep-2007 |
\"Talpey, Thomas\ <Thomas.Talpey@netapp.com> |
NFS: use in-kernel mount argument structure for nfsv4 mounts The user-visible nfs4_mount_data does not contain sufficient data to describe new mount options, and also is now a legacy structure. Replace it with the internal nfs_parsed_mount_data for nfsv4 in-kernel use. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
2283f8d6 |
|
10-Sep-2007 |
\"Talpey, Thomas\ <Thomas.Talpey@netapp.com> |
NFS: use in-kernel mount argument structure for nfsv[23] mounts The user-visible nfs_mount_data does not contain sufficient data to describe new mount options, and also is now a legacy structure. Replace it with the internal nfs_parsed_mount_data for nfsv[23] in-kernel use. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
6b18eaa0 |
|
10-Sep-2007 |
\"Talpey, Thomas\ <Thomas.Talpey@netapp.com> |
NFS: move nfs_parsed_mount_data structure definition In preparation for rearranging the nfs mount argument passing, make the nfs_parsed_mount_data struct visible across nfs kernel files. Signed-off-by: Tom Talpey <tmt@netapp.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
9eaa67c6 |
|
30-Jun-2007 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Clean-up: use correct type when converting NFS blocks to local blocks inode->i_blocks is a blkcnt_t these days, which can be a u64 or unsigned long, depending on the setting of CONFIG_LSF. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
8d5658c9 |
|
10-Apr-2007 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Fix a buffer overflow in the allocation of struct nfs_read/writedata Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
c228fd3a |
|
13-Jan-2007 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFSv4: Cleanups for fs_locations code. Start long arduous project... What the hell is struct dentry = {}; all about? Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
49a70f27 |
|
04-Dec-2006 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Cleanup: add common helper nfs_page_length() Clean up a lot of ad-hoc page length calculations in fs/nfs/write.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
0dbb4c67 |
|
20-Oct-2006 |
Al Viro <viro@ftp.linux.org.uk> |
[PATCH] xdr annotations: NFS readdir entries on-the-wire data is big-endian [in large part pulled from Alexey's patch] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Acked-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
6aaca566 |
|
22-Aug-2006 |
David Howells <dhowells@redhat.com> |
NFS: Add server and volume lists to /proc Make two new proc files available: /proc/fs/nfsfs/servers /proc/fs/nfsfs/volumes The first lists the servers with which we are currently dealing (struct nfs_client), and the second lists the volumes we have on those servers (struct nfs_server). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
54ceac45 |
|
22-Aug-2006 |
David Howells <dhowells@redhat.com> |
NFS: Share NFS superblocks per-protocol per-server per-FSID The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
5006a76c |
|
22-Aug-2006 |
David Howells <dhowells@redhat.com> |
NFS: Eliminate client_sys in favour of cl_rpcclient Eliminate nfs_server::client_sys in favour of nfs_client::cl_rpcclient as we only really need one per server that we're talking to since it doesn't have any security on it. The retransmission management variables are also moved to the common struct as they're required to set up the cl_rpcclient connection. The NFS2/3 client and client_acl connections are thenceforth derived by cloning the cl_rpcclient connection and post-applying the authorisation flavour. The code for setting up the initial common connection has been moved to client.c as nfs_create_rpc_client(). All the NFS program definition tables are also moved there as that's where they're now required rather than super.c. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
24c8dbbb |
|
22-Aug-2006 |
David Howells <dhowells@redhat.com> |
NFS: Generalise the nfs_client structure Generalise the nfs_client structure by: (1) Moving nfs_client to a more general place (nfs_fs_sb.h). (2) Renaming its maintenance routines to be non-NFS4 specific. (3) Move those maintenance routines to a new non-NFS4 specific file (client.c) and move the declarations to internal.h. (4) Make nfs_find/get_client() take a full sockaddr_in to include the port number (will be required for NFS2/3). (5) Make nfs_find/get_client() take the NFS protocol version (again will be required to differentiate NFS2, 3 & 4 client records). Also: (6) Make nfs_client construction proceed akin to inodes, marking them as under construction and providing a function to indicate completion. (7) Make nfs_get_client() wait interruptibly if it finds a client that it can share, but that client is currently being constructed. (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
7d4e2747 |
|
22-Aug-2006 |
David Howells <dhowells@redhat.com> |
NFS: Fix up split of fs/nfs/inode.c Fix ups for the splitting of the superblock stuff out of fs/nfs/inode.c, including: (*) Move the callback tcpport module param into callback.c. (*) Move the idmap cache timeout module param into idmap.c. (*) Changes to internal.h: (*) namespace-nfs4.c was renamed to nfs4namespace.c. (*) nfs_stat_to_errno() is in nfs2xdr.c, not nfs4xdr.c. (*) nfs4xdr.c is contingent on CONFIG_NFS_V4. (*) nfs4_path() is only uses if CONFIG_NFS_V4 is set. Plus also: (*) The sec_flavours[] table should really be const. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
979df72e |
|
25-Jul-2006 |
Trond Myklebust <Trond.Myklebust@netapp.com> |
NFS: Add an ACCESS cache memory shrinker A pinned inode may in theory end up filling memory with cached ACCESS calls. This patch ensures that the VM may shrink away the cache in these particular cases. The shrinker works by iterating through the list of inodes on the global nfs_access_lru_list, and removing the least recently used access cache entry until it is done (or until the entire cache is empty). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
4ebd9ab3 |
|
02-Jul-2006 |
Dominik Hackl <dominik@hackl.dhs.org> |
[PATCH] nfs: non-procfs build fix This fixes a bug in fs/nfs which makes it impossible to build nfs without having procfs enabled. Signed-off-by: Dominik Hackl <dominik@hackl.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
266bee88 |
|
27-Jun-2006 |
David Brownell <david-b@pacbell.net> |
[PATCH] fix static linking of NFS Builds on ARM report link problems with common configurations like statically linked NFS (for nfsroot). The symptom is that __init section code references __exit section code; that won't work since the exit sections are discarded (since they can never be called). The best fix for these particular cases would be an "__init_or_exit" section annotation. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
#
d75d5414 |
|
25-Jun-2006 |
Andrew Morton <akpm@osdl.org> |
git-nfs-build-fixes Fix various problems with nfs4 disabled. And various other things. In file included from fs/nfs/inode.c:50: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' fs/nfs/inode.c: In function 'init_once': fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'open_states' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation_state' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'rwsem' distcc[26452] ERROR: compile fs/nfs/inode.c on g5/64 failed make[1]: *** [fs/nfs/inode.o] Error 1 make: *** [fs/nfs/inode.o] Error 2 make: *** Waiting for unfinished jobs.... In file included from fs/nfs/nfs3xdr.c:26: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' distcc[26486] ERROR: compile fs/nfs/nfs3xdr.c on g5/64 failed make[1]: *** [fs/nfs/nfs3xdr.o] Error 1 make: *** [fs/nfs/nfs3xdr.o] Error 2 In file included from fs/nfs/nfs3proc.c:24: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' distcc[26469] ERROR: compile fs/nfs/nfs3proc.c on bix/32 failed make[1]: *** [fs/nfs/nfs3proc.o] Error 1 make: *** [fs/nfs/nfs3proc.o] Error 2 **FAILED** Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andreas Gruenbacher <agruen@suse.de> Cc: Andy Adamson <andros@citi.umich.edu> Cc: Chuck Lever <cel@netapp.com> Cc: David Howells <dhowells@redhat.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Manoj Naik <manoj@almaden.ibm.com> Cc: Marc Eshel <eshel@almaden.ibm.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
|
#
f7b422b1 |
|
09-Jun-2006 |
David Howells <dhowells@redhat.com> |
NFS: Split fs/nfs/inode.c As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached patch splits it up into a number of files: (*) fs/nfs/inode.c Strictly inode specific functions. (*) fs/nfs/super.c Superblock management functions for NFS and NFS4, normal access, clones and referrals. The NFS4 superblock functions _could_ move out into a separate conditionally compiled file, but it's probably not worth it as there're so many common bits. (*) fs/nfs/namespace.c Some namespace-specific functions have been moved here. (*) fs/nfs/nfs4namespace.c NFS4-specific namespace functions (this could be merged into the previous file). This file is conditionally compiled. (*) fs/nfs/internal.h Inter-file declarations, plus a few simple utility functions moved from fs/nfs/inode.c. Additionally, all the in-.c-file externs have been moved here, and those files they were moved from now includes this file. For the most part, the functions have not been changed, only some multiplexor functions have changed significantly. I've also: (*) Added some extra banner comments above some functions. (*) Rearranged the function order within the files to be more logical and better grouped (IMO), though someone may prefer a different order. (*) Reduced the number of #ifdefs in .c files. (*) Added missing __init and __exit directives. Signed-Off-By: David Howells <dhowells@redhat.com>
|