#
8ddb7142 |
|
23-Apr-2024 |
Chuck Lever <chuck.lever@oracle.com> |
Revert "NFSD: Convert the callback workqueue to use delayed_work" This commit was a pre-requisite for commit c1ccfcf1a9bf ("NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"), which has already been reverted. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
c5967721 |
|
15-Feb-2024 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the request is handled as below: Server sends CB_GETATTR to client to get the latest change info and file size. If these values are the same as the server's cached values then the GETATTR proceeds as normal. If either the change info or file size is different from the server's cached values, or the file was already marked as modified, then: . update time_modify and time_metadata into file's metadata with current time . encode GETATTR as normal except the file size is encoded with the value returned from CB_GETATTR . mark the file as modified If the CB_GETATTR fails for any reasons, the delegation is recalled and NFS4ERR_DELAY is returned for the GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
6487a13b |
|
15-Feb-2024 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: add support for CB_GETATTR callback Includes: . CB_GETATTR proc for nfs4_cb_procedures[] . XDR encoding and decoding function for CB_GETATTR request/reply . add nfs4_cb_fattr to nfs4_delegation for sending CB_GETATTR and store file attributes from client's reply. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
f8104027 |
|
08-Feb-2024 |
Chen Hanxiao <chenhx.fnst@fujitsu.com> |
nfsd: clean up comments over nfs4_client definition nfsd fault injection has been deprecated since commit 9d60d93198c6 ("Deprecate nfsd fault injection") and removed by commit e56dc9e2949e ("nfsd: remove fault injection code") So remove the outdated parts about fault injection. Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
1ac3629b |
|
29-Jan-2024 |
NeilBrown <neilb@suse.de> |
nfsd: prepare for supporting admin-revocation of state The NFSv4 protocol allows state to be revoked by the admin and has error codes which allow this to be communicated to the client. This patch - introduces a new state-id status SC_STATUS_ADMIN_REVOKED which can be set on open, lock, or delegation state. - reports NFS4ERR_ADMIN_REVOKED when these are accessed - introduces a per-client counter of these states and returns SEQ4_STATUS_ADMIN_STATE_REVOKED when the counter is not zero. Decrements this when freeing any admin-revoked state. - introduces stub code to find all interesting states for a given superblock so they can be revoked via the 'unlock_filesystem' file in /proc/fs/nfsd/ No actual states are handled yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
3f29cc82 |
|
29-Jan-2024 |
NeilBrown <neilb@suse.de> |
nfsd: split sc_status out of sc_type sc_type identifies the type of a state - open, lock, deleg, layout - and also the status of a state - closed or revoked. This is a bit untidy and could get worse when "admin-revoked" states are added. So clean it up. With this patch, the type is now all that is stored in sc_type. This is zero when the state is first added to ->cl_stateids (causing it to be ignored), and is then set appropriately once it is fully initialised. It is set under ->cl_lock to ensure atomicity w.r.t lookup. It is now never cleared. sc_type is still a bit-set even though at most one bit is set. This allows lookup functions to be given a bitmap of acceptable types. sc_type is now an unsigned short rather than char. There is no value in restricting to just 8 bits. All the constants now start SC_TYPE_ matching the field in which they are stored. Keeping the existing names and ensuring clear separation from non-type flags would have required something like NFS4_STID_TYPE_CLOSED which is cumbersome. The "NFS4" prefix is redundant was they only appear in NFS4 code, so remove that and change STID to SC to match the field. The status is stored in a separate unsigned short named "sc_status". It has two flags: SC_STATUS_CLOSED and SC_STATUS_REVOKED. CLOSED combines NFS4_CLOSED_STID, NFS4_CLOSED_DELEG_STID, and is used for SC_TYPE_LOCK and SC_TYPE_LAYOUT instead of setting the sc_type to zero. These flags are only ever set, never cleared. For deleg stateids they are set under the global state_lock. For open and lock stateids they are set under ->cl_lock. For layout stateids they are set under ->ls_lock nfs4_unhash_stid() has been removed, and we never set sc_type = 0. This was only used for LOCK and LAYOUT stids and they now use SC_STATUS_CLOSED. Also TRACE_DEFINE_NUM() calls for the various STID #define have been removed because these things are not enums, and so that call is incorrect. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
fe0e9580 |
|
25-Jan-2024 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Convert the callback workqueue to use delayed_work Normally, NFSv4 callback operations are supposed to be sent to the client as soon as they are queued up. In a moment, I will introduce a recovery path where the server has to wait for the client to reconnect. We don't want a hard busy wait here -- the callback should be requeued to try again in several milliseconds. For now, convert nfsd4_callback from struct work_struct to struct delayed_work, and queue with a zero delay argument. This should avoid behavior changes for current operation. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
1227561c |
|
15-Dec-2023 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0 There's nothing wrong with this commit, but this is dead code now that nothing triggers a CB_GETATTR callback. It can be re-introduced once the issues with handling conflicting GETATTRs are resolved. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
862bee84 |
|
16-Dec-2023 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d For some reason, the wait_on_bit() in nfsd4_deleg_getattr_conflict() is waiting forever, preventing a clean server shutdown. The requesting client might also hang waiting for a reply to the conflicting GETATTR. Invoking wait_on_bit() in an nfsd thread context is a hazard. The correct fix is to replace this wait_on_bit() call site with a mechanism that defers the conflicting GETATTR until the CB_GETATTR completes or is known to have failed. That will require some surgery and extended testing and it's late in the v6.7-rc cycle, so I'm reverting now in favor of trying again in a subsequent kernel release. This is my fault: I should have recognized the ramifications of calling wait_on_bit() in here before accepting this patch. Thanks to Dai Ngo <dai.ngo@oracle.com> for diagnosing the issue. Reported-by: Wolfgang Walter <linux-nfs@stwm.de> Closes: https://lore.kernel.org/linux-nfs/e3d43ecdad554fbdcaa7181833834f78@stwm.de/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
bd9d6a3e |
|
11-Sep-2023 |
Lorenzo Bianconi <lorenzo@kernel.org> |
NFSD: add rpc_status netlink support Introduce rpc_status netlink support for NFSD in order to dump pending RPC requests debugging information from userspace. Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=366 Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
6c41d9a9 |
|
13-Sep-2023 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the request is handled as below: Server sends CB_GETATTR to client to get the latest change info and file size. If these values are the same as the server's cached values then the GETATTR proceeds as normal. If either the change info or file size is different from the server's cached values, or the file was already marked as modified, then: . update time_modify and time_metadata into file's metadata with current time . encode GETATTR as normal except the file size is encoded with the value returned from CB_GETATTR . mark the file as modified If the CB_GETATTR fails for any reasons, the delegation is recalled and NFS4ERR_DELAY is returned for the GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
738401a9 |
|
13-Sep-2023 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: add support for CB_GETATTR callback Includes: . CB_GETATTR proc for nfs4_cb_procedures[] . XDR encoding and decoding function for CB_GETATTR request/reply . add nfs4_cb_fattr to nfs4_delegation for sending CB_GETATTR and store file attributes from client's reply. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
fd19ca36 |
|
29-Jun-2023 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the write delegation is recalled. If the delegation is returned within 30ms then the GETATTR is serviced as normal otherwise the NFS4ERR_DELAY error is returned for the GETATTR. Add counter for write delegation recall due to conflict GETATTR. This is used to evaluate the need to implement CB_GETATTR to adoid recalling the delegation with conflit GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
fcb53097 |
|
17-Jan-2023 |
Jeff Layton <jlayton@kernel.org> |
nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS We're not doing any blocking operations for OP_OFFLOAD_STATUS, so taking and putting a reference is a waste of effort. Take the client lock, search for the copy and fetch the wr_bytes_written field and return. Also, make find_async_copy a static function. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
44df6f43 |
|
16-Nov-2022 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: add delegation reaper to react to low memory condition The delegation reaper is called by nfsd memory shrinker's on the 'count' callback. It scans the client list and sends the courtesy CB_RECALL_ANY to the clients that hold delegations. To avoid flooding the clients with CB_RECALL_ANY requests, the delegation reaper sends only one CB_RECALL_ANY request to each client per 5 seconds. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> [ cel: moved definition of RCA4_TYPE_MASK_RDATA_DLG ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
3959066b |
|
16-Nov-2022 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: add support for sending CB_RECALL_ANY Add XDR encode and decode function for CB_RECALL_ANY. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
d47b295e |
|
28-Oct-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Use rhashtable for managing nfs4_file objects fh_match() is costly, especially when filehandles are large (as is the case for NFSv4). It needs to be used sparingly when searching data structures. Unfortunately, with common workloads, I see multiple thousands of objects stored in file_hashtbl[], which has just 256 buckets, making its bucket hash chains quite lengthy. Walking long hash chains with the state_lock held blocks other activity that needs that lock. Sizable hash chains are a common occurrance once the server has handed out some delegations, for example -- IIUC, each delegated file is held open on the server by an nfs4_file object. To help mitigate the cost of searching with fh_match(), replace the nfs4_file hash table with an rhashtable, which can dynamically resize its bucket array to minimize hash chain length. The result of this modification is an improvement in the latency of NFSv4 operations, and the reduction of nfsd CPU utilization due to eliminating the cost of multiple calls to fh_match() and reducing the CPU cache misses incurred while walking long hash chains in the nfs4_file hash table. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org>
|
#
b95239ca |
|
26-Sep-2022 |
Jeff Layton <jlayton@kernel.org> |
nfsd: make nfsd4_run_cb a bool return function queue_work can return false and not queue anything, if the work is already queued. If that happens in the case of a CB_RECALL, we'll have taken an extra reference to the stid that will never be put. Ensure we throw a warning in that case. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
781fde1a |
|
22-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Rename the fields in copy_stateid_t Code maintenance: The name of the copy_stateid_t::sc_count field collides with the sc_count field in struct nfs4_stid, making the latter difficult to grep for when auditing stateid reference counting. No behavior change expected. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
18224dc5 |
|
09-Sep-2022 |
Gaosheng Cui <cuigaosheng1@huawei.com> |
nfsd: remove nfsd4_prepare_cb_recall() declaration nfsd4_prepare_cb_recall() has been removed since commit 0162ac2b978e ("nfsd: introduce nfsd4_callback_ops"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
80e591ce |
|
02-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Increase NFSD_MAX_OPS_PER_COMPOUND When attempting an NFSv4 mount, a Solaris NFSv4 client builds a single large COMPOUND that chains a series of LOOKUPs to get to the pseudo filesystem root directory that is to be mounted. The Linux NFS server's current maximum of 16 operations per NFSv4 COMPOUND is not large enough to ensure that this works for paths that are more than a few components deep. Since NFSD_MAX_OPS_PER_COMPOUND is mostly a sanity check, and most NFSv4 COMPOUNDS are between 3 and 6 operations (thus they do not trigger any re-allocation of the operation array on the server), increasing this maximum should result in little to no impact. The ops array can get large now, so allocate it via vmalloc() to help ensure memory fragmentation won't cause an allocation failure. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216383 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
8ea6e2c9 |
|
27-Jul-2022 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Make nfs4_put_copy() static Clean up: All call sites are in fs/nfsd/nfs4proc.c. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
66af2579 |
|
02-May-2022 |
Dai Ngo <dai.ngo@oracle.com> |
NFSD: add courteous server support for thread with only delegation This patch provides courteous server support for delegation only. Only expired client with delegation but no conflict and no open or lock state is allowed to be in COURTESY state. Delegation conflict with COURTESY/EXPIRABLE client is resolved by setting it to EXPIRABLE, queue work for the laundromat and return delay to the caller. Conflict is resolved when the laudromat runs and expires the EXIRABLE client while the NFS client retries the OPEN request. Local thread request that gets conflict is doing the retry in _break_lease. Client in COURTESY or EXPIRABLE state is allowed to reconnect and continues to have access to its state. Access to the nfs4_client by the reconnecting thread and the laundromat is serialized via the client_lock. Reviewed-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
47446d74 |
|
16-Dec-2021 |
Vasily Averin <vvs@virtuozzo.com> |
nfsd4: add refcount for nfsd4_blocked_lock nbl allocated in nfsd4_lock can be released by a several ways: directly in nfsd4_lock(), via nfs4_laundromat(), via another nfs command RELEASE_LOCKOWNER or via nfsd4_callback. This structure should be refcounted to be used and released correctly in all these cases. Refcount is initialized to 1 during allocation and is incremented when nbl is added into nbl_list/nbl_lru lists. Usually nbl is linked into both lists together, so only one refcount is used for both lists. However nfsd4_lock() should keep in mind that nbl can be present in one of lists only. This can happen if nbl was handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is decremented if vfs_lock_file() returns FILE_LOCK_DEFERRED, because nbl can be handled already by nfs4_laundromat/nfsd4_callback/etc. Refcount is not changed in find_blocked_lock() because of it reuses counter released after removing nbl from lists. Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
3dcd1d8a |
|
07-Dec-2021 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: improve stateid access bitmask documentation The use of the bitmaps is confusing. Add a cross-reference to make it easier to find the existing comment. Add an updated reference with URL to make it quicker to look up. And a bit more editorializing about the value of this. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
a0ce4837 |
|
16-Apr-2021 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: track filehandle aliasing in nfs4_files It's unusual but possible for multiple filehandles to point to the same file. In that case, we may end up with multiple nfs4_files referencing the same inode. For delegation purposes it will turn out to be useful to flag those cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
f9b60e22 |
|
16-Apr-2021 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: hash nfs4_files by inode number The nfs4_file structure is per-filehandle, not per-inode, because the spec requires open and other state to be per filehandle. But it will turn out to be convenient for nfs4_files associated with the same inode to be hashed to the same bucket, so let's hash on the inode instead of the filehandle. Filehandle aliasing is rare, so that shouldn't have much performance impact. (If you have a ton of exported filesystems, though, and all of them have a root with inode number 2, could that get you an overlong hash chain? Perhaps this (and the v4 open file cache) should be hashed on the inode pointer instead.) Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
472d155a |
|
19-Mar-2021 |
NeilBrown <neilb@suse.de> |
nfsd: report client confirmation status in "info" file mountd can now monitor clients appearing and disappearing in /proc/fs/nfsd/clients, and will log these events, in liu of the logging of mount/unmount events for NFSv3. Currently it cannot distinguish between unconfirmed clients (which might be transient and totally uninteresting) and confirmed clients. So add a "status: " line which reports either "confirmed" or "unconfirmed", and use fsnotify to report that the info file has been modified. This requires a bit of infrastructure to keep the dentry for the "info" file. There is no need to take a counted reference as the dentry must remain around until the client is removed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
1722b046 |
|
21-Jan-2021 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: simplify nfsd4_check_open_reclaim The set_client() was already taken care of by process_open1(). The comments here are mostly redundant with the code. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
e56dc9e2 |
|
30-Jul-2020 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: remove fault injection code It was an interesting idea but nobody seems to be using it, it's buggy at this point, and nfs4state.c is already complicated enough without it. The new nfsd/clients/ code provides some of the same functionality, and could probably do more if desired. This feature has been deprecated since 9d60d93198c6 ("Deprecate nfsd fault injection"). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
dd5e3fbc |
|
05-Apr-2020 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: Add tracepoints to the NFSD state management code Capture obvious events and replace dprintk() call sites. Introduce infrastructure so that adding more tracepoints in this code later is simplified. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
20b7d86f |
|
04-Nov-2019 |
Arnd Bergmann <arnd@arndb.de> |
nfsd: use boottime for lease expiry calculation A couple of time_t variables are only used to track the state of the lease time and its expiration. The code correctly uses the 'time_after()' macro to make this work on 32-bit architectures even beyond year 2038, but the get_seconds() function and the time_t type itself are deprecated as they behave inconsistently between 32-bit and 64-bit architectures and often lead to code that is not y2038 safe. As a minor issue, using get_seconds() leads to problems with concurrent settimeofday() or clock_settime() calls, in the worst case timeout never triggering after the time has been set backwards. Change nfsd to use time64_t and ktime_get_boottime_seconds() here. This is clearly excessive, as boottime by itself means we never go beyond 32 bits, but it does mean we handle this correctly and consistently without having to worry about corner cases and should be no more expensive than the previous implementation on 64-bit architectures. The max_cb_time() function gets changed in order to avoid an expensive 64-bit division operation, but as the lease time is at most one hour, there is no change in behavior. Also do the same for server-to-server copy expiration time. Signed-off-by: Arnd Bergmann <arnd@arndb.de> [bfields@redhat.com: fix up copy expiration] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9594497f |
|
04-Nov-2019 |
Arnd Bergmann <arnd@arndb.de> |
nfsd: fix jiffies/time_t mixup in LRU list The nfsd4_blocked_lock->nbl_time timestamp is recorded in jiffies, but then compared to a CLOCK_REALTIME timestamp later on, which makes no sense. For consistency with the other timestamps, change this to use a time_t. This is a change in behavior, which may cause regressions, but the current code is not sensible. On a system with CONFIG_HZ=1000, the 'time_after((unsigned long)nbl->nbl_time, (unsigned long)cutoff))' check is false for roughly the first 18 days of uptime and then true for the next 49 days. Fixes: 7919d0a27f1e ("nfsd: add a LRU list for blocked locks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
e29f4703 |
|
31-Oct-2019 |
Arnd Bergmann <arnd@arndb.de> |
nfsd: print 64-bit timestamps in client_info_show The nii_time field gets truncated to 'time_t' on 32-bit architectures before printing. Remove the use of 'struct timespec' to product the correct output beyond 2038. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ce0887ac |
|
09-Oct-2019 |
Olga Kornievskaia <kolga@netapp.com> |
NFSD add nfs4 inter ssc to nfsd4_copy Given a universal address, mount the source server from the destination server. Use an internal mount. Call the NFS client nfs42_ssc_open to obtain the NFS struct file suitable for nfsd_copy_range. Ability to do "inter" server-to-server depends on the an nfsd kernel parameter "inter_copy_offload_enable". Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
|
#
624322f1 |
|
04-Oct-2019 |
Olga Kornievskaia <olga.kornievskaia@gmail.com> |
NFSD add COPY_NOTIFY operation Introducing the COPY_NOTIFY operation. Create a new unique stateid that will keep track of the copy state and the upcoming READs that will use that stateid. Each associated parent stateid has a list of copy notify stateids. A copy notify structure makes a copy of the parent stateid and a clientid and will use it to look up the parent stateid during the READ request (suggested by Trond Myklebust <trond.myklebust@hammerspace.com>). At nfs4_put_stid() time, we walk the list of the associated copy notify stateids and delete them. Laundromat thread will traverse globally stored copy notify stateid in idr and notice if any haven't been referenced in the lease period, if so, it'll remove them. Return single netaddr to advertise to the copy. Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Andy Adamson <andros@netapp.com>
|
#
2bbfed98 |
|
23-Oct-2019 |
Trond Myklebust <trondmy@gmail.com> |
nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback() When we're destroying the client lease, and we call nfsd4_shutdown_callback(), we must ensure that we do not return before all outstanding callbacks have terminated and have released their payloads. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
6ee95d1c |
|
09-Sep-2019 |
Scott Mayhew <smayhew@redhat.com> |
nfsd: add support for upcall version 2 Version 2 upcalls will allow the nfsd to include a hash of the kerberos principal string in the Cld_Create upcall. If a principal is present in the svc_cred, then the hash will be included in the Cld_Create upcall. We attempt to use the svc_cred.cr_raw_principal (which is returned by gssproxy) first, and then fall back to using the svc_cred.cr_principal (which is returned by both gssproxy and rpc.svcgssd). Upon a subsequent restart, the hash will be returned in the Cld_Gracestart downcall and stored in the reclaim_str_hashtbl so it can be used when handling reclaim opens. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5c4583b2 |
|
18-Aug-2019 |
Jeff Layton <jeff.layton@primarydata.com> |
nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache Have nfs4_preprocess_stateid_op pass back a nfsd_file instead of a filp. Since we now presume that the struct file will be persistent in most cases, we can stop fiddling with the raparms in the read code. This also means that we don't really care about the rd_tmp_file field anymore. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
eb82dd39 |
|
18-Aug-2019 |
Jeff Layton <jeff.layton@primarydata.com> |
nfsd: convert fi_deleg_file and ls_file fields to nfsd_file Have them keep an nfsd_file reference instead of a struct file. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
fd4f83fd |
|
18-Aug-2019 |
Jeff Layton <jeff.layton@primarydata.com> |
nfsd: convert nfs4_file->fi_fds array to use nfsd_files Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
79123444 |
|
04-Jun-2019 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: decode implementation id Decode the implementation ID and display in nfsd/clients/#/info. It may be help identify the client. It won't be used otherwise. (When this went into the protocol, I thought the implementation ID would be a slippery slope towards implementation-specific workarounds as with the http user-agent. But I guess I was wrong, the risk seems pretty low now.) Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
e8a79fb1 |
|
22-Mar-2019 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: add nfsd/clients directory I plan to expose some information about nfsv4 clients here. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
59f8e91b |
|
20-Mar-2019 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: use reference count to free client Keep a second reference count which is what is really used to decide when to free the client's memory. Next I'm going to add an nfsd/clients/ directory with a subdirectory for each NFSv4 client. File objects under nfsd/clients/ will hold these references. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
14ed14cc |
|
20-Mar-2019 |
J. Bruce Fields <bfields@redhat.com> |
nfsd: rename cl_refcount Rename this to a more descriptive name: it counts the number of in-progress rpc's referencing this client. Next I'm going to add a second refcount with a slightly different use. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
15b6ff95 |
|
12-Jun-2019 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
nfsd: no need to check return value of debugfs_create functions When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20190612152603.GB18440@kroah.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
6b189105 |
|
26-Mar-2019 |
Scott Mayhew <smayhew@redhat.com> |
nfsd: make nfs4_client_reclaim use an xdr_netobj instead of a fixed char array This will allow the reclaim_str_hashtbl to store either the recovery directory names used by the legacy client tracking code or the full client strings used by the nfsdcld client tracking code. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
e6abc8ca |
|
05-Apr-2019 |
Trond Myklebust <trondmy@gmail.com> |
nfsd: Don't release the callback slot unless it was actually held If there are multiple callbacks queued, waiting for the callback slot when the callback gets shut down, then they all currently end up acting as if they hold the slot, and call nfsd4_cb_sequence_done() resulting in interesting side-effects. In addition, the 'retry_nowait' path in nfsd4_cb_sequence_done() causes a loop back to nfsd4_cb_prepare() without first freeing the slot, which causes a deadlock when nfsd41_cb_get_slot() gets called a second time. This patch therefore adds a boolean to track whether or not the callback did pick up the slot, so that it can do the right thing in these 2 cases. Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a52458b4 |
|
02-Dec-2018 |
NeilBrown <neilb@suse.com> |
NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'. SUNRPC has two sorts of credentials, both of which appear as "struct rpc_cred". There are "generic credentials" which are supplied by clients such as NFS and passed in 'struct rpc_message' to indicate which user should be used to authorize the request, and there are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS which describe the credential to be sent over the wires. This patch replaces all the generic credentials by 'struct cred' pointers - the credential structure used throughout Linux. For machine credentials, there is a special 'struct cred *' pointer which is statically allocated and recognized where needed as having a special meaning. A look-up of a low-level cred will map this to a machine credential. Signed-off-by: NeilBrown <neilb@suse.com> Acked-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
e0639dc5 |
|
20-Jul-2018 |
Olga Kornievskaia <kolga@netapp.com> |
NFSD introduce async copy feature Upon receiving a request for async copy, create a new kthread. If we get asynchronous request, make sure to copy the needed arguments/state from the stack before starting the copy. Then start the thread and reply back to the client indicating copy is asynchronous. nfsd_copy_file_range() will copy in a loop over the total number of bytes is needed to copy. In case a failure happens in the middle, we ignore the error and return how much we copied so far. Once done creating a workitem for the callback workqueue and send CB_OFFLOAD with the results. The lifetime of the copy stateid is bound to the vfs copy. This way we don't need to keep the nfsd_net structure for the callback. We could keep it around longer so that an OFFLOAD_STATUS that came late would still get results, but clients should be able to deal without that. We handle OFFLOAD_CANCEL by sending a signal to the copy thread and calling kthread_stop. A client should cancel any ongoing copies before calling DESTROY_CLIENT; if not, we return a CLIENT_BUSY error. If the client is destroyed for some other reason (lease expiration, or server shutdown), we must clean up any ongoing copies ourselves. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> [colin.king@canonical.com: fix leak in error case] [bfields@fieldses.org: remove signalling, merge patches] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9eb190fca8f |
|
20-Jul-2018 |
Olga Kornievskaia <kolga@netapp.com> |
NFSD CB_OFFLOAD xdr Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a26dd64f |
|
15-Aug-2018 |
Chuck Lever <chuck.lever@oracle.com> |
nfsd: Remove callback_cred Clean up: The global callback_cred is no longer used, so it can be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
818a34eb |
|
19-Oct-2017 |
Elena Reshetova <elena.reshetova@intel.com> |
fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nfs4_file.fi_ref is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
cff7cb2e |
|
19-Oct-2017 |
Elena Reshetova <elena.reshetova@intel.com> |
fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nfs4_cntl_odstate.co_odcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a15dfcd5 |
|
19-Oct-2017 |
Elena Reshetova <elena.reshetova@intel.com> |
fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable nfs4_stid.sc_count is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <keescook@chromium.org> Reviewed-by: David Windsor <dwindsor@gmail.com> Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
53da6a53 |
|
17-Oct-2017 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: catch some false session retries The spec allows us to return NFS4ERR_SEQ_FALSE_RETRY if we notice that the client is making a call that matches a previous (slot, seqid) pair but that *isn't* actually a replay, because some detail of the call doesn't actually match the previous one. Catching every such case is difficult, but we may as well catch a few easy ones. This also handles the case described in the previous patch, in a different way. The spec does however require us to catch the case where the difference is in the rpc credentials. This prevents somebody from snooping another user's replies by fabricating retries. (But the practical value of the attack is limited by the fact that the replies with the most sensitive data are READ replies, which are not normally cached.) Tested-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
085def3a |
|
18-Oct-2017 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix cached replies to solo SEQUENCE compounds Currently our handling of 4.1+ requests without "cachethis" set is confusing and not quite correct. Suppose a client sends a compound consisting of only a single SEQUENCE op, and it matches the seqid in a session slot (so it's a retry), but the previous request with that seqid did not have "cachethis" set. The obvious thing to do might be to return NFS4ERR_RETRY_UNCACHED_REP, but the protocol only allows that to be returned on the op following the SEQUENCE, and there is no such op in this case. The protocol permits us to cache replies even if the client didn't ask us to. And it's easy to do so in the case of solo SEQUENCE compounds. So, when we get a solo SEQUENCE, we can either return the previously cached reply or NFSERR_SEQ_FALSE_RETRY if we notice it differs in some way from the original call. Currently, we're returning a corrupt reply in the case a solo SEQUENCE matches a previous compound with more ops. This actually matters because the Linux client recently started doing this as a way to recover from lost replies to idempotent operations in the case the process doing the original reply was killed: in that case it's difficult to keep the original arguments around to do a real retry, and the client no longer cares what the result is anyway, but it would like to make sure that the slot's sequence id has been incremented, and the solo SEQUENCE assures that: if the server never got the original reply, it will increment the sequence id. If it did get the original reply, it won't increment, and nothing else that about the reply really matters much. But we can at least attempt to return valid xdr! Tested-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f7d1ddbe |
|
04-Feb-2017 |
Kinglong Mee <kinglongmee@gmail.com> |
nfsd/callback: Cleanup callback cred on shutdown The rpccred gotten from rpc_lookup_machine_cred() should be put when state is shutdown. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d19fb70d |
|
18-Jan-2017 |
Kinglong Mee <kinglongmee@gmail.com> |
NFSD: Fix a null reference case in find_or_create_lock_stateid() nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid(). If nfsd doesn't go through init_lock_stateid() and put stateid at end, there is a NULL reference to .sc_free when calling nfs4_put_stid(ns). This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid(). Cc: stable@vger.kernel.org Fixes: 356a95ece7aa "nfsd: clean up races in lock stateid searching..." Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7919d0a2 |
|
16-Sep-2016 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add a LRU list for blocked locks It's possible for a client to call in on a lock that is blocked for a long time, but discontinue polling for it. A malicious client could even set a lock on a file, and then spam the server with failing lock requests from different lockowners that pile up in a DoS attack. Add the blocked lock structures to a per-net namespace LRU when hashing them, and timestamp them. If the lock request is not revisited after a lease period, we'll drop it under the assumption that the client is no longer interested. This also gives us a mechanism to clean up these objects at server shutdown time as well. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
76d348fa |
|
16-Sep-2016 |
Jeff Layton <jlayton@kernel.org> |
nfsd: have nfsd4_lock use blocking locks for v4.1+ locks Create a new per-lockowner+per-inode structure that contains a file_lock. Have nfsd4_lock add this structure to the lockowner's list prior to setting the lock. Then call the vfs and request a blocking lock (by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED back, then we dequeue the block structure and free it. When the next lock request comes in, we'll look for an existing block for the same filehandle and dequeue and reuse it if there is one. When the lock comes free (a'la an lm_notify call), we dequeue it from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to inform the client that it should retry the lock request. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a188620e |
|
16-Sep-2016 |
Jeff Layton <jlayton@kernel.org> |
nfsd: plumb in a CB_NOTIFY_LOCK operation Add the encoding/decoding for CB_NOTIFY_LOCK operations. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
89dfdc96 |
|
16-Aug-2016 |
Jeff Layton <jlayton@kernel.org> |
nfsd: eliminate cb_minorversion field We already have that info in the client pointer. No need to pass around a copy. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ed941643 |
|
14-Jun-2016 |
Andrew Elble <aweits@rit.edu> |
nfsd: implement machine credential support for some operations This addresses the conundrum referenced in RFC5661 18.35.3, and will allow clients to return state to the server using the machine credentials. The biggest part of the problem is that we need to allow the client to send a compound op with integrity/privacy on mounts that don't have it enabled. Add server support for properly decoding and using spo_must_enforce and spo_must_allow bits. Add support for machine credentials to be used for CLOSE, OPEN_DOWNGRADE, LOCKU, DELEGRETURN, and TEST/FREE STATEID. Implement a check so as to not throw WRONGSEC errors when these operations are used if integrity/privacy isn't turned on. Without this, Linux clients with credentials that expired while holding delegations were getting stuck in an endless loop. Signed-off-by: Andrew Elble <aweits@rit.edu> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
feb9dad5 |
|
14-Jun-2016 |
Oleg Drokin <green@linuxhacker.ru> |
nfsd: Always lock state exclusively. It used to be the case that state had an rwlock that was locked for write by downgrades, but for read for upgrades (opens). Well, the problem is if there are two competing opens for the same state, they step on each other toes potentially leading to leaking file descriptors from the state structure, since access mode is a bitmap only set once. Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
14b7f4a1 |
|
05-May-2016 |
Jeff Layton <jlayton@kernel.org> |
nfsd: handle seqid wraparound in nfsd4_preprocess_layout_stateid Move the existing static function to an inline helper, and call it. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
aa0d6aed |
|
02-Dec-2015 |
Anna Schumaker <Anna.Schumaker@netapp.com> |
nfsd: Pass filehandle to nfs4_preprocess_stateid_op() This will be needed so COPY can look up the saved_fh in addition to the current_fh. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
#
c4cb8974 |
|
21-Nov-2015 |
Julia Lawall <Julia.Lawall@lip6.fr> |
nfsd: constify nfsd4_callback_ops structure The nfsd4_callback_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9767feb2 |
|
01-Oct-2015 |
Jeff Layton <jlayton@kernel.org> |
nfsd: ensure that seqid morphing operations are atomic wrt to copies Bruce points out that the increment of the seqid in stateids is not serialized in any way, so it's possible for racing calls to bump it twice and end up sending the same stateid. While we don't have any reports of this problem it _is_ theoretically possible, and could lead to spurious state recovery by the client. In the current code, update_stateid is always followed by a memcpy of that stateid, so we can combine the two operations. For better atomicity, we add a spinlock to the nfs4_stid and hold that when bumping the seqid and copying the stateid. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
cc8a5532 |
|
17-Sep-2015 |
Jeff Layton <jlayton@kernel.org> |
nfsd: serialize layout stateid morphing operations In order to allow the client to make a sane determination of what happened with racing LAYOUTGET/LAYOUTRETURN/CB_LAYOUTRECALL calls, we must ensure that the seqids return accurately represent the order of operations. The simplest way to do that is to ensure that operations on a single stateid are serialized. This patch adds a mutex to the layout stateid, and locks it when checking the layout stateid's seqid. The mutex is held over the entire operation and released after the seqid is bumped. Note that in the case of CB_LAYOUTRECALL we must move the increment of the seqid and setting into a new cb "prepare" operation. The lease infrastructure will call the lm_break callback with a spinlock held, so and we can't take the mutex in that codepath. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
35a92fe8 |
|
17-Sep-2015 |
Jeff Layton <jlayton@kernel.org> |
nfsd: serialize state seqid morphing operations Andrew was seeing a race occur when an OPEN and OPEN_DOWNGRADE were running in parallel. The server would receive the OPEN_DOWNGRADE first and check its seqid, but then an OPEN would race in and bump it. The OPEN_DOWNGRADE would then complete and bump the seqid again. The result was that the OPEN_DOWNGRADE would be applied after the OPEN, even though it should have been rejected since the seqid changed. The only recourse we have here I think is to serialize operations that bump the seqid in a stateid, particularly when we're given a seqid in the call. To address this, we add a new rw_semaphore to the nfs4_ol_stateid struct. We do a down_write prior to checking the seqid after looking up the stateid to ensure that nothing else is going to bump it while we're operating on it. In the case of OPEN, we do a down_read, as the call doesn't contain a seqid. Those can run in parallel -- we just need to serialize them when there is a concurrent OPEN_DOWNGRADE or CLOSE. LOCK and LOCKU however always take the write lock as there is no opportunity for parallelizing those. Reported-and-Tested-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7ba6cad6 |
|
24-Jun-2015 |
Kinglong Mee <kinglongmee@gmail.com> |
nfsd: New helper nfsd4_cb_sequence_done() for processing more cb errors According to Christoph's advice, this patch introduce a new helper nfsd4_cb_sequence_done() for processing more callback errors, following the example of the client's nfs41_sequence_done(). Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
af90f707 |
|
18-Jun-2015 |
Christoph Hellwig <hch@lst.de> |
nfsd: take struct file setup fully into nfs4_preprocess_stateid_op This patch changes nfs4_preprocess_stateid_op so it always returns a valid struct file if it has been asked for that. For that we now allocate a temporary struct file for special stateids, and check permissions if we got the file structure from the stateid. This ensures that all callers will get their handling of special stateids right, and avoids code duplication. There is a little wart in here because the read code needs to know if we allocated a file structure so that it can copy around the read-ahead parameters. In the long run we should probably aim to cache full file structures used with special stateids instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
276f03e3 |
|
02-Jun-2015 |
Kinglong Mee <kinglongmee@gmail.com> |
nfsd: Update callback sequnce id only CB_SEQUENCE success When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED. It is caused by nfs return NFS4ERR_DELAY before validate_seqid(), don't update the sequnce id, but nfsd updates the sequnce id !!! According to RFC5661 20.9.3, " If CB_SEQUENCE returns an error, then the state of the slot (sequence ID, cached reply) MUST NOT change. " Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
cba5f62b |
|
30-Apr-2015 |
Christoph Hellwig <hch@lst.de> |
nfsd: fix callback restarts Checking the rpc_client pointer is not a reliable way to detect backchannel changes: cl_cb_client is changed only after shutting down the rpc client, so the condition cl_cb_client = tk_client will always be true. Check the RPC_TASK_KILLED flag instead, and rewrite the code to avoid the buggy cl_callbacks list and fix the lifetime rules due to double calls of the ->prepare callback operations method for this retry case. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ef2a1b3e |
|
30-Apr-2015 |
Christoph Hellwig <hch@lst.de> |
nfsd: split transport vs operation errors for callbacks We must only increment the sequence id if the client has seen and responded to a request. If we failed to deliver it to the client we must resend with the same sequence id. So just like the client track errors at the transport level differently from those returned in the XDR. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8287f009 |
|
27-Apr-2015 |
Sachin Bhamare <sachin.bhamare@primarydata.com> |
nfsd: fix pNFS return on close semantics For the sake of forgetful clients, the server should return the layouts to the file system on 'last close' of a file (assuming that there are no delegations outstanding to that particular client) or on delegreturn (assuming that there are no opens on a file from that particular client). In theory the information is all there in current data structures, but it's not efficiently available; nfs4_file->fi_ref includes references on the file across all clients, but we need a per-(client, file) count. Walking through lots of stateid's to calculate this on each close or delegreturn would be painful. This patch introduces infrastructure to maintain per-client opens and delegation counters on a per-file basis. [hch: ported to the mainline pNFS support, merged various fixes from Jeff] Signed-off-by: Sachin Bhamare <sachin.bhamare@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c5c707f9 |
|
22-Sep-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: implement pNFS layout recalls Add support to issue layout recalls to clients. For now we only support full-file recalls to get a simple and stable implementation. This allows to embedd a nfsd4_callback structure in the layout_state and thus avoid any memory allocations under spinlocks during a recall. For normal use cases that do not intent to share a single file between multiple clients this implementation is fully sufficient. To ensure layouts are recalled on local filesystem access each layout state registers a new FL_LAYOUT lease with the kernel file locking code, which filesystems that support pNFS exports that require recalls need to break on conflicting access patterns. The XDR code is based on the old pNFS server implementation by Andy Adamson, Benny Halevy, Boaz Harrosh, Dean Hildebrand, Fred Isaman, Marc Eshel, Mike Sager and Ricardo Labiaga. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
9cf514cc |
|
05-May-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: implement pNFS operations Add support for the GETDEVICEINFO, LAYOUTGET, LAYOUTCOMMIT and LAYOUTRETURN NFSv4.1 operations, as well as backing code to manage outstanding layouts and devices. Layout management is very straight forward, with a nfs4_layout_stateid structure that extends nfs4_stid to manage layout stateids as the top-level structure. It is linked into the nfs4_file and nfs4_client structures like the other stateids, and contains a linked list of layouts that hang of the stateid. The actual layout operations are implemented in layout drivers that are not part of this commit, but will be added later. The worst part of this commit is the management of the pNFS device IDs, which suffers from a specification that is not sanely implementable due to the fact that the device-IDs are global and not bound to an export, and have a small enough size so that we can't store the fsid portion of a file handle, and must never be reused. As we still do need perform all export authentication and validation checks on a device ID passed to GETDEVICEINFO we are caught between a rock and a hard place. To work around this issue we add a new hash that maps from a 64-bit integer to a fsid so that we can look up the export to authenticate against it, a 32-bit integer as a generation that we can bump when changing the device, and a currently unused 32-bit integer that could be used in the future to handle more than a single device per export. Entries in this hash table are never deleted as we can't reuse the ids anyway, and would have a severe lifetime problem anyway as Linux export structures are temporary structures that can go away under load. Parts of the XDR data, structures and marshaling/unmarshaling code, as well as many concepts are derived from the old pNFS server implementation from Andy Adamson, Benny Halevy, Dean Hildebrand, Marc Eshel, Fred Isaman, Mike Sager, Ricardo Labiaga and many others. Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
4d227fca |
|
17-Aug-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: make find_any_file available outside nfs4state.c Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
e6ba76e1 |
|
14-Aug-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: make find/get/put file available outside nfs4state.c Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
cd61c522 |
|
14-Aug-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: make lookup/alloc/unhash_stid available outside nfs4state.c Signed-off-by: Christoph Hellwig <hch@lst.de>
|
#
67db1034 |
|
13-Dec-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: fi_delegees doesn't need to be an atomic_t fi_delegees is always handled under the fi_lock, so there's no need to use an atomic_t for this field. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5b095e99 |
|
23-Oct-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: convert nfs4_file searches to use RCU The global state_lock protects the file_hashtbl, and that has the potential to be a scalability bottleneck. Address this by making the file_hashtbl use RCU. Add a rcu_head to the nfs4_file and use that when freeing ones that have been hashed. In order to conserve space, we union the fi_rcu field with the fi_delegations list_head which must be clear by the time the last reference to the file is dropped. Convert find_file_locked to use RCU lookup primitives and not to require that the state_lock be held, and convert find_file to do a lockless lookup. Convert find_or_add_file to attempt a lockless lookup first, and then fall back to doing a locked search and insert if that fails to find anything. Also, minimize the number of times we need to calculate the hash value by passing it in as an argument to the search and insert functions, and optimize the order of arguments in nfsd4_init_file. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ccc6398e |
|
16-Oct-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: clean up comments over nfs4_file definition They're a bit outdated wrt to some recent changes. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
0c637be8 |
|
21-Aug-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: don't keep a pointer to the lease in nfs4_file Now that we don't need to pass in an actual lease pointer to vfs_setlease on unlock, we can stop tracking a pointer to the lease in the nfs4_file. Switch all of the places that check the fi_lease to check fi_deleg_file instead. We always set that at the same time so it will have the same semantics. Cc: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
|
#
34549ab0 |
|
01-Oct-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: eliminate "to_delegation" define We now have cb_to_delegation and to_delegation, which do the same thing and are defined separately in different .c files. Move the cb_to_delegation definition into a header file and eliminate the redundant to_delegation definition. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com>
|
#
0162ac2b |
|
23-Sep-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: introduce nfsd4_callback_ops Add a higher level abstraction than the rpc_ops for callback operations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f0b5de1b |
|
23-Sep-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: split nfsd4_callback initialization and use Split out initializing the nfs4_callback structure from using it. For the NULL callback this gets rid of tons of pointless re-initializations. Note that I don't quite understand what protects us from running multiple NULL callbacks at the same time, but at least this chance doesn't make it worse.. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
326129d0 |
|
23-Sep-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: introduce a generic nfsd4_cb Add a helper to queue up a callback. CB_NULL has a bit of special casing because it is special in the specification, but all other new callback operations will be able to share code with this and a few more changes to refactor the callback code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2faf3b43 |
|
23-Sep-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: remove nfsd4_callback.cb_op We can always get at the private data by using container_of, no need for a void pointer. Also introduce a little to_delegation helper to avoid opencoding the container_of everywhere. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d682e750 |
|
12-Sep-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: serialize nfsdcltrack upcalls for a particular client In a later patch, we want to add a flag that will allow us to reduce the need for upcalls. In order to handle that correctly, we'll need to ensure that racing upcalls for the same client can't occur. In practice it should be rare for this to occur with a well-behaved client, but it is possible. Convert one of the bits in the cl_flags field to be an upcall bitlock, and use it to ensure that upcalls for the same client are serialized. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
|
#
7f5ef2e9 |
|
12-Sep-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add a v4_end_grace file to /proc/fs/nfsd Allow a privileged userland process to end the v4 grace period early. Writing "Y", "y", or "1" to the file will cause the v4 grace period to be lifted. The basic idea with this will be to allow the userland client tracking program to lift the grace period once it knows that no more clients will be reclaiming state. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
|
#
919b8049 |
|
12-Sep-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: remove redundant boot_time parm from grace_done client tracking op Since it's stored in nfsd_net, we don't need to pass it in separately. Signed-off-by: Jeff Layton <jlayton@primarydata.com>
|
#
14a571a8 |
|
05-Aug-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add some comments to the nfsd4 object definitions Add some comments that describe what each of these objects is, and how they related to one another. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b687f686 |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: remove the client_mutex and the nfs4_lock/unlock_state wrappers Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
285abdee |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: remove old fault injection infrastructure Remove the old nfsd_for_n_state function and move nfsd_find_client higher up into the file to get rid of forward declaration. Remove the struct nfsd_fault_inject_op arguments from the operations as they are no longer needed by any of them. Finally, remove the old "standard" get and set routines, which also eliminates the client_mutex from this code. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
98d5c7c5 |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add more granular locking to *_delegations fault injectors ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
82e05efa |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add more granular locking to forget_openowners fault injector ...instead of relying on the client_mutex. Also, fix up the printk output that is generated when the file is read. It currently says that it's reporting the number of open files, but it's actually reporting the number of openowners. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
016200c3 |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add more granular locking to forget_locks fault injector ...instead of relying on the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
69fc9edf |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add nfsd_inject_forget_clients ...which uses the client_lock for protection instead of client_mutex. Also remove nfsd_forget_client as there are no more callers. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a0926d15 |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add a forget_client set_clnt routine ...that relies on the client_lock instead of client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7ec0e36f |
|
30-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add a forget_clients "get" routine with proper locking Add a new "get" routine for forget_clients that relies on the client_lock instead of the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
83e452fe |
|
31-Jul-2014 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix out of date comment Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d4f0489f |
|
29-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Move the open owner hash table into struct nfs4_client Preparation for removing the client_mutex. Convert the open owner hash table into a per-client table and protect it using the nfs4_client->cl_lock spin lock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d3134b10 |
|
29-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: make openstateids hold references to their openowners Change it so that only openstateids hold persistent references to openowners. References can still be held by compounds in progress. With this, we can get rid of NFS4_OO_NEW. It's possible that we will create a new openowner in the process of doing the open, but something later fails. In the meantime, another task could find that openowner and start using it on a successful open. If that occurs we don't necessarily want to tear it down, just put the reference that the failing compound holds. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8f4b54c5 |
|
29-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add an operation for unhashing a stateowner Allow stateowners to be unhashed and destroyed when the last reference is put. The unhashing must be idempotent. In a future patch, we'll add some locking around it, but for now it's only protected by the client_mutex. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
58fb12e6 |
|
29-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: Add a mutex to protect the NFSv4.0 open owner replay cache We don't want to rely on the client_mutex for protection in the case of NFSv4 open owners. Instead, we add a mutex that will only be taken for NFSv4.0 state mutating operations, and that will be released once the entire compound is done. Also, ensure that nfsd4_cstate_assign_replay/nfsd4_cstate_clear_replay take a reference to the stateowner when they are using it for NFSv4.0 open and lock replay caching. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
6b180f0b |
|
29-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: Add reference counting to state owners The way stateowners are managed today is somewhat awkward. They need to be explicitly destroyed, even though the stateids reference them. This will be particularly problematic when we remove the client_mutex. We may create a new stateowner and attempt to open a file or set a lock, and have that fail. In the meantime, another RPC may come in that uses that same stateowner and succeed. We can't have the first task tearing down the stateowner in that situation. To fix this, we need to change how stateowners are tracked altogether. Refcount them and only destroy them once all stateids that reference them have been destroyed. This patch starts by adding the refcounting necessary to do that. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
11b9164a |
|
29-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Add a struct nfs4_file field to struct nfs4_stid All stateids are associated with a nfs4_file. Let's consolidate. Replace delegation->dl_file with the dl_stid.sc_file, and nfs4_ol_stateid->st_file with st_stid.sc_file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
6011695d |
|
29-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Add reference counting to the lock and open stateids When we remove the client_mutex, we'll need to be able to ensure that these objects aren't destroyed while we're not holding locks. Add a ->free() callback to the struct nfs4_stid, so that we can release a reference to the stid without caring about the contents. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
650ecc8f |
|
25-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: remove dl_fh field from struct nfs4_delegation Now that the nfs4_file has a filehandle in it, we no longer need to keep a per-delegation copy of it. Switch to using the one in the nfs4_file instead. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f9c00c3a |
|
23-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: Do not let nfs4_file pin the struct inode Remove the fi_inode field in struct nfs4_file in order to remove the possibility of struct nfs4_file pinning the inode when it does not have any open state. The only place we still need to get to an inode is in check_for_locks, so change it to use find_any_file and use the inode from any that it finds. If it doesn't find one, then just assume there aren't any. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
e2cf80d7 |
|
23-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Store the filehandle with the struct nfs4_file For use when we may not have a struct inode. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
72c0b0fb |
|
21-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Move the delegation reference counter into the struct nfs4_stid We will want to add reference counting to the lock stateid and open stateids too in later patches. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b0fc29d6 |
|
16-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Ensure stateids remain unique until they are freed Add an extra delegation state to allow the stateid to remain in the idr tree until the last reference has been released. This will be necessary to ensure uniqueness once the client_mutex is removed. [jlayton: reset the sc_type under the state_lock in unhash_delegation] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
02e1215f |
|
16-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: Avoid taking state_lock while holding inode lock in nfsd_break_one_deleg state_lock is a heavily contended global lock. We don't want to grab that while simultaneously holding the inode->i_lock. Add a new per-nfs4_file lock that we can use to protect the per-nfs4_file delegation list. Hold that while walking the list in the break_deleg callback and queue the workqueue job for each one. The workqueue job can then take the state_lock and do the list manipulations without the i_lock being held prior to starting the rpc call. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
e8051c83 |
|
16-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: eliminate nfsd4_init_callback It's just an obfuscated INIT_WORK call. Just make the work_func_t a non-static symbol and use a normal INIT_WORK call. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
baeb4ff0 |
|
10-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: make deny mode enforcement more efficient and close races in it The current enforcement of deny modes is both inefficient and scattered across several places, which makes it hard to guarantee atomicity. The inefficiency is a problem now, and the lack of atomicity will mean races once the client_mutex is removed. First, we address the inefficiency. We have to track deny modes on a per-stateid basis to ensure that open downgrades are sane, but when the server goes to enforce them it has to walk the entire list of stateids and check against each one. Instead of doing that, maintain a per-nfs4_file deny mode. When a file is opened, we simply set any deny bits in that mode that were specified in the OPEN call. We can then use that unified deny mode to do a simple check to see whether there are any conflicts without needing to walk the entire stateid list. The only time we'll need to walk the entire list of stateids is when a stateid that has a deny mode on it is being released, or one is having its deny mode downgraded. In that case, we must walk the entire list and recalculate the fi_share_deny field. Since deny modes are pretty rare today, this should be very rare under normal workloads. To address the potential for races once the client_mutex is removed, protect fi_share_deny with the fi_lock. In nfs4_get_vfs_file, check to make sure that any deny mode we want to apply won't conflict with existing access. If that's ok, then have nfs4_file_get_access check that new access to the file won't conflict with existing deny modes. If that also passes, then get file access references, set the correct access and deny bits in the stateid, and update the fi_share_deny field. If opening the file or truncating it fails, then unwind the whole mess and return the appropriate error. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c11c591f |
|
10-Jul-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: shrink st_access_bmap and st_deny_bmap We never use anything above bit #3, so an unsigned long for each is wasteful. Shrink them to a char each, and add some WARN_ON_ONCE calls if we try to set or clear bits that would go outside those sizes. Note too that because atomic bitops work on unsigned longs, we have to abandon their use here. That shouldn't be a problem though since we don't really care about the atomicity in this code anyway. Using them was just a convenient way to flip bits. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
de18643d |
|
10-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Add locking to the nfs4_file->fi_fds[] array Preparation for removal of the client_mutex, which currently protects this array. While we don't actually need the find_*_file_locked variants just yet, a later patch will. So go ahead and add them now to reduce future churn in this code. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
1d31a253 |
|
10-Jul-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Add fine grained protection for the nfs4_file->fi_stateids list Access to this list is currently serialized by the client_mutex. Add finer grained locking around this list in preparation for its removal. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
0fe492db |
|
30-Jun-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Convert nfs4_check_open_reclaim() to work with lookup_clientid() lookup_clientid is preferable to find_confirmed_client since it's able to use the cached client in the compound state. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d4e19e70 |
|
30-Jun-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Don't get a session reference without a client reference If the client were to disappear from underneath us while we're holding a session reference, things would be bad. This cleanup helps ensure that it cannot, which will be a possibility when the client_mutex is removed. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
fd44907c |
|
30-Jun-2014 |
Jeff Layton <jlayton@kernel.org> |
nfsd: clean up nfsd4_release_lockowner Now that we know that we won't have several lockowners with the same, owner->data, we can simplify nfsd4_release_lockowner and get rid of the lo_list in the process. Signed-off-by: Jeff Layton <jlayton@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b3c32bcd |
|
30-Jun-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: NFSv4 lock-owners are not associated to a specific file Just like open-owners, lock-owners are associated with a name, a clientid and, in the case of minor version 0, a sequence id. There is no association to a file. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3c87b9b7 |
|
30-Jun-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: lock owners are not per open stateid In the NFSv4 spec, lock stateids are per-file objects. Lockowners are not. This patch replaces the current list of lock owners in the open stateids with a list of lock stateids. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b607664e |
|
30-Jun-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
nfsd: Cleanup nfs4svc_encode_compoundres Move the slot return, put session etc into a helper in fs/nfsd/nfs4state.c instead of open coding in nfs4svc_encode_compoundres. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
24906f32 |
|
12-Mar-2014 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: allow larger 4.1 session drc slots The client is actually asking for 2532 bytes. I suspect that's a mistake. But maybe we can allow some more. In theory lock needs more if it might return a maximum-length lockowner in the denied case. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
50cc6231 |
|
18-Apr-2014 |
Trond Myklebust <trond.myklebust@primarydata.com> |
NFSd: Mark nfs4_free_lockowner and nfs4_free_openowner as static functions They do not need to be used outside fs/nfsd/nfs4state.c Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9c69de4c |
|
06-May-2014 |
Christoph Hellwig <hch@lst.de> |
nfsd: remove <linux/nfsd/nfsfh.h> The only real user of this header is fs/nfsd/nfsfh.h, so merge the two. Various lockѕ source files used it to indirectly get other sunrpc or nfs headers, so fix those up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
57266a6e |
|
13-Apr-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: implement minimal SP4_MACH_CRED Do a minimal SP4_MACH_CRED implementation suggested by Trond, ignoring the client-provided spo_must_* arrays and just enforcing credential checks for the minimum required operations. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3bd64a5b |
|
09-Apr-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: implement SEQ4_STATUS_RECALLABLE_STATE_REVOKED A 4.1 server must notify a client that has had any state revoked using the SEQ4_STATUS_RECALLABLE_STATE_REVOKED flag. The client can figure out exactly which state is the problem using CHECK_STATEID and then free it using FREE_STATEID. The status flag will be unset once all such revoked stateids are freed. Our server's only recallable state is delegations. So we keep with each 4.1 client a list of delegations that have timed out and been recalled, but haven't yet been freed by FREE_STATEID. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9411b1d4 |
|
01-Apr-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: cleanup handling of nfsv4.0 closed stateid's Closed stateid's are kept around a little while to handle close replays in the 4.0 case. So we stash them in the last-used stateid in the oo_last_closed_stateid field of the open owner. We can free that in encode_seqid_op_tail once the seqid on the open owner is next incremented. But we don't want to do that on the close itself; so we set NFS4_OO_PURGE_CLOSE flag set on the open owner, skip freeing it the first time through encode_seqid_op_tail, then when we see that flag set next time we free it. This is unnecessarily baroque. Instead, just move the logic that increments the seqid out of the xdr code and into the operation code itself. The justification given for the current placement is that we need to wait till the last minute to be sure we know whether the status is a sequence-id-mutating error or not, but examination of the code shows that can't actually happen. Reported-by: Yanchuan Nian <ycnian@gmail.com> Tested-by: Yanchuan Nian <ycnian@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
89876f8c |
|
02-Apr-2013 |
Jeff Layton <jlayton@kernel.org> |
nfsd: convert the file_hashtbl to a hlist We only ever traverse the hash chains in the forward direction, so a double pointer list head isn't really necessary. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
66b2b9b2 |
|
18-Mar-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: don't destroy in-use session This changes session destruction to be similar to client destruction in that attempts to destroy a session while in use (which should be rare corner cases) result in DELAY. This simplifies things somewhat and helps meet a coming 4.2 requirement. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
221a6876 |
|
01-Apr-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: don't destroy in-use clients When a setclientid_confirm or create_session confirms a client after a client reboot, it also destroys any previous state held by that client. The shutdown of that previous state must be careful not to free the client out from under threads processing other requests that refer to the client. This is a particular problem in the NFSv4.1 case when we hold a reference to a session (hence a client) throughout compound processing. The server attempts to handle this by unhashing the client at the time it's destroyed, then delaying the final free to the end. But this still leaves some races in the current code. I believe it's simpler just to fail the attempt to destroy the client by returning NFS4ERR_DELAY. This is a case that should never happen anyway. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b0a9d3ab |
|
07-Mar-2013 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix race on client shutdown Dropping the session's reference count after the client's means we leave a window where the session's se_client pointer is NULL. An xpt_user callback that encounters such a session may then crash: [ 303.956011] BUG: unable to handle kernel NULL pointer dereference at 0000000000000318 [ 303.959061] IP: [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] PGD 37811067 PUD 3d498067 PMD 0 [ 303.959061] Oops: 0002 [#8] PREEMPT SMP [ 303.959061] Modules linked in: md5 nfsd auth_rpcgss nfs_acl snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc microcode psmouse snd_timer serio_raw pcspkr evdev snd soundcore i2c_piix4 i2c_core intel_agp intel_gtt processor button nfs lockd sunrpc fscache ata_generic pata_acpi ata_piix uhci_hcd libata btrfs usbcore usb_common crc32c scsi_mod libcrc32c zlib_deflate floppy virtio_balloon virtio_net virtio_pci virtio_blk virtio_ring virtio [ 303.959061] CPU 0 [ 303.959061] Pid: 264, comm: nfsd Tainted: G D 3.8.0-ARCH+ #156 Bochs Bochs [ 303.959061] RIP: 0010:[<ffffffff81481a8e>] [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] RSP: 0018:ffff880037877dd8 EFLAGS: 00010202 [ 303.959061] RAX: 0000000000000100 RBX: ffff880037a2b698 RCX: ffff88003d879278 [ 303.959061] RDX: ffff88003d879278 RSI: dead000000100100 RDI: 0000000000000318 [ 303.959061] RBP: ffff880037877dd8 R08: ffff88003c5a0f00 R09: 0000000000000002 [ 303.959061] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 303.959061] R13: 0000000000000318 R14: ffff880037a2b680 R15: ffff88003c1cbe00 [ 303.959061] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000 [ 303.959061] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 303.959061] CR2: 0000000000000318 CR3: 000000003d49c000 CR4: 00000000000006f0 [ 303.959061] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 303.959061] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 303.959061] Process nfsd (pid: 264, threadinfo ffff880037876000, task ffff88003c1fd0a0) [ 303.959061] Stack: [ 303.959061] ffff880037877e08 ffffffffa03772ec ffff88003d879000 ffff88003d879278 [ 303.959061] ffff88003d879080 0000000000000000 ffff880037877e38 ffffffffa0222a1f [ 303.959061] 0000000000107ac0 ffff88003c22e000 ffff88003d879000 ffff88003c1cbe00 [ 303.959061] Call Trace: [ 303.959061] [<ffffffffa03772ec>] nfsd4_conn_lost+0x3c/0xa0 [nfsd] [ 303.959061] [<ffffffffa0222a1f>] svc_delete_xprt+0x10f/0x180 [sunrpc] [ 303.959061] [<ffffffffa0223d96>] svc_recv+0xe6/0x580 [sunrpc] [ 303.959061] [<ffffffffa03587c5>] nfsd+0xb5/0x140 [nfsd] [ 303.959061] [<ffffffffa0358710>] ? nfsd_destroy+0x90/0x90 [nfsd] [ 303.959061] [<ffffffff8107ae00>] kthread+0xc0/0xd0 [ 303.959061] [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pte_at+0x50/0x100 [ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70 [ 303.959061] [<ffffffff814898ec>] ret_from_fork+0x7c/0xb0 [ 303.959061] [<ffffffff8107ad40>] ? kthread_freezable_should_stop+0x70/0x70 [ 303.959061] Code: ff ff 5d c3 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 65 48 8b 04 25 f0 c6 00 00 48 89 e5 83 80 44 e0 ff ff 01 b8 00 01 00 00 <3e> 66 0f c1 07 0f b6 d4 38 c2 74 0f 66 0f 1f 44 00 00 f3 90 0f [ 303.959061] RIP [<ffffffff81481a8e>] _raw_spin_lock+0x1e/0x40 [ 303.959061] RSP <ffff880037877dd8> [ 303.959061] CR2: 0000000000000318 [ 304.001218] ---[ end trace 2d809cd4a7931f5a ]--- [ 304.001903] note: nfsd[264] exited with preempt_count 2 Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
03bc6d1c |
|
02-Feb-2013 |
Eric W. Biederman <ebiederm@xmission.com> |
nfsd: Modify nfsd4_cb_sec to use kuids and kgids Change uid and gid in struct nfsd4_cb_sec to be of type kuid_t and kgid_t. In nfsd4_decode_cb_sec when reading uids and gids off the wire convert them to kuids and kgids, and if they don't convert to valid kuids or valid kuids ignore RPC_AUTH_UNIX and don't fill in any of the fields. Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
#
6c1e82a4 |
|
29-Nov-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFSD: Forget state for a specific client Write the client's ip address to any state file and all appropriate state for that client will be forgotten. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
184c1847 |
|
29-Nov-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFSD: Reading a fault injection file prints a state count I also log basic information that I can figure out about the type of state (such as number of locks for each client IP address). This can be useful for checking that state was actually dropped and later for checking if the client was able to recover. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8ce54e0d |
|
29-Nov-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFSD: Fault injection operations take a per-client forget function The eventual goal is to forget state based on ip address, so it makes sense to call this function in a for-each-client loop until the correct amount of state is forgotten. I also use this patch as an opportunity to rename the forget function from "func()" to "forget()". Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f3c7521f |
|
27-Nov-2012 |
Bryan Schumaker <bjschuma@netapp.com> |
NFSD: Fold fault_inject.h into state.h There were only a small number of functions in this file and since they all affect stored state I think it makes sense to put them in state.h instead. I also dropped most static inline declarations since there are no callers when fault injection is not enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
12760c66 |
|
14-Nov-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
nfsd: pass nfsd_net instead of net to grace enders Passing net context looks as overkill. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3320fef19 |
|
14-Nov-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
nfsd: use service net instead of hard-coded init_net This patch replaces init_net by SVC_NET(), where possible and also passes proper context to nested functions where required. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
52e19c09 |
|
14-Nov-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
nfsd: make reclaim_str_hashtbl allocated per net This hash holds nfs4_clients info, which are network namespace aware. So let's make it allocated per network namespace. Note: this hash is used only by legacy tracker. So let's allocate hash in tracker init. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c212cecf |
|
14-Nov-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
nfsd: make nfs4_client network namespace dependent And use it's net where possible. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2216d449 |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: get rid of cl_recdir field Remove the cl_recdir field from the nfs4_client struct. Instead, just compute it on the fly when and if it's needed, which is now only when the legacy client tracking code is in effect. The error handling in the legacy client tracker is also changed to handle the case where md5 is unavailable. In that case, we'll warn the admin with a KERN_ERR message and disable the client tracking. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ac55fdc4 |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: move the confirmed and unconfirmed hlists to a rbtree The current code requires that we md5 hash the name in order to store the client in the confirmed and unconfirmed trees. Change it instead to store the clients in a pair of rbtrees, and simply compare the cl_names directly instead of hashing them. This also necessitates that we add a new flag to the clp->cl_flags field to indicate which tree the client is currently in. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
0ce0c2b5 |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: don't search for client by hash on legacy reboot recovery gracedone When nfsd starts, the legacy reboot recovery code creates a tracking struct for each directory in the v4recoverydir. When the grace period ends, it basically does a "readdir" on the directory again, and matches each dentry in there to an existing client id to see if it should be removed or not. If the matching client doesn't exist, or hasn't reclaimed its state then it will remove that dentry. This is pretty inefficient since it involves doing a lot of hash-bucket searching. It also means that we have to keep relying on being able to search for a nfs4_client by md5 hashed cl_recdir name. Instead, add a pointer to the nfs4_client that indicates the association between the nfs4_client_reclaim and nfs4_client. When a reclaim operation comes in, we set the pointer to make that association. On gracedone, the legacy client tracker will keep the recdir around iff: 1/ there is a reclaim record for the directory ...and... 2/ there's an association between the reclaim record and a client record -- that is, a create or check operation was performed on the client that matches that directory. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
772a9bbb |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: make nfs4_client_to_reclaim return a pointer to the reclaim record Later callers will need to make changes to the record. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ce30e539 |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: break out reclaim record removal into separate function We'll need to be able to call this from nfs4recover.c eventually. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
278c931c |
|
12-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: have nfsd4_find_reclaim_client take a char * argument Currently, it takes a client pointer, but later we're going to need to search for these records without knowing whether a matching client even exists. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a0af710a |
|
09-Nov-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: remove unused argument to nfs4_has_reclaimed_state Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
12fc3e92 |
|
05-Nov-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: backchannel should use client-provided security flavor For now this only adds support for AUTH_NULL. (Previously we assumed AUTH_UNIX.) We'll also need AUTH_GSS, which is trickier. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
57725155 |
|
05-Nov-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: common helper to initialize callback work I've found it confusing having the only references to nfsd4_do_callback_rpc() in a different file. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
cb73a9f4 |
|
01-Nov-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: implement backchannel_ctl operation This operation is mandatory for servers to implement. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c6bb3ca2 |
|
01-Nov-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: use callback security parameters in create_session We're currently ignoring the callback security parameters specified in create_session, and just assuming the client wants auth_sys, because that's all the current linux client happens to care about. But this could cause us callbacks to fail to a client that wanted something different. For now, all we're doing is no longer ignoring the uid and gid passed in the auth_sys case. Further patches will add support for auth_null and gss (and possibly use more of the auth_sys information; the spec wants us to use exactly the credential we're passed, though it's hard to imagine why a client would care). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
acb2887e |
|
27-Mar-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: clean up callback security parsing Move the callback parsing into a separate function. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d15c077e |
|
13-Sep-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: enforce per-client sessions/no-sessions distinction Something like creating a client with setclientid and then trying to confirm it with create_session may not crash the server, but I'm not completely positive of that, and in any case it's obviously bad client behavior. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
1696c47c |
|
06-Aug-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: trivial comment updates locks.c doesn't use the BKL anymore and there is no fi_perfile field. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
39307655 |
|
16-Aug-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix security flavor of NFSv4.0 callback Commit d5497fc693a446ce9100fcf4117c3f795ddfd0d2 "nfsd4: move rq_flavor into svc_cred" forgot to remove cl_flavor from the client, leaving two places (cl_flavor and cl_cred.cr_flavor) for the flavor to be stored. After that patch, the latter was the one that was updated, but the former was the one that the callback used. Symptoms were a long delay on utime(). This is because the utime() generated a setattr which recalled a delegation, but the cb_recall was ignored by the client because it had the wrong security flavor. Cc: stable@vger.kernel.org Tested-by: Jamie Heilman <jamie@audible.transient.net> Reported-by: Jamie Heilman <jamie@audible.transient.net> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2c142baa |
|
25-Jul-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
NFSd: make boot_time variable per network namespace NFSd's boot_time represents grace period start point in time. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5ccb0066 |
|
25-Jul-2012 |
Stanislav Kinsbursky <skinsbursky@parallels.com> |
LockD: pass actual network namespace to grace period management functions Passed network namespace replaced hard-coded init_net Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7df302f7 |
|
29-May-2012 |
Chuck Lever <chuck.lever@oracle.com> |
NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID According to RFC 5661, the TEST_STATEID operation is not allowed to return NFS4ERR_STALE_STATEID. In addition, RFC 5661 says: 15.1.16.5. NFS4ERR_STALE_STATEID (Error Code 10023) A stateid generated by an earlier server instance was used. This error is moot in NFSv4.1 because all operations that take a stateid MUST be preceded by the SEQUENCE operation, and the earlier server instance is detected by the session infrastructure that supports SEQUENCE. I triggered NFS4ERR_STALE_STATEID while testing the Linux client's NOGRACE recovery. Bruce suggested an additional test that could be useful to client developers. Lastly, RFC 5661, section 18.48.3 has this: o Special stateids are always considered invalid (they result in the error code NFS4ERR_BAD_STATEID). An explicit check is made for those state IDs to avoid printk noise. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
03a4e1f6 |
|
14-May-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: move principal name into svc_cred Instead of keeping the principal name associated with a request in a structure that's private to auth_gss and using an accessor function, move it to svc_cred. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2a4317c5 |
|
21-Mar-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: add nfsd4_client_tracking_ops struct and a way to set it Abstract out the mechanism that we use to track clients into a set of client name tracking functions. This gives us a mechanism to plug in a new set of client tracking functions without disturbing the callers. It also gives us a way to decide on what tracking scheme to use at runtime. For now, this just looks like pointless abstraction, but later we'll add a new alternate scheme for tracking clients on stable storage. Note too that this patch anticipates the eventual containerization of this code by passing in struct net pointers in places. No attempt is made to containerize the legacy client tracker however. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
a52d726b |
|
21-Mar-2012 |
Jeff Layton <jlayton@kernel.org> |
nfsd: convert nfs4_client->cl_cb_flags to a generic flags field We'll need a way to flag the nfs4_client as already being recorded on stable storage so that we don't continually upcall. Currently, that's recorded in the cl_firststate field of the client struct. Using an entire u32 to store a flag is rather wasteful though. The cl_cb_flags field is only using 2 bits right now, so repurpose that to a generic flags field. Rename NFSD4_CLIENT_KILL to NFSD4_CLIENT_CB_KILL to make it evident that it's part of the callback flags. Add a mask that we can use for existing checks that look to see whether any flags are set, so that the new flags don't interfere. Convert all references to cl_firstate to the NFSD4_CLIENT_STABLE flag, and add a new NFSD4_CLIENT_RECLAIM_COMPLETE flag. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
508dc6e1 |
|
23-Feb-2012 |
Benny Halevy <bhalevy@tonian.com> |
nfsd41: free_session/free_client must be called under the client_lock The session client is manipulated under the client_lock hence both free_session and nfsd4_del_conns must be called under this lock. This patch adds a BUG_ON that checks this condition in the respective functions and implements the missing locks. nfsd4_{get,put}_session helpers were moved to the C file that uses them so to prevent use from external files and an unlocked version of nfsd4_put_session is provided for external use from nfs4xdr.c Signed-off-by: Benny Halevy <bhalevy@tonian.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
bf5c43c8 |
|
13-Feb-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: check for uninitialized slot This fixes an oops when a buggy client tries to use an initial seqid of 0 on a new slot, which we may misinterpret as a replay. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
73e79482 |
|
13-Feb-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: rearrange struct nfsd4_slot Combine two booleans into a single flag field, move the smaller fields to the end. (In practice this doesn't make the struct any smaller. But we'll be adding another flag here soon.) Remove some debugging code that doesn't look useful, while we're in the neighborhood. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7a6ef8c7 |
|
05-Jan-2012 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: nfsd4_create_clid_dir return value is unused Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
009673b4 |
|
07-Nov-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: add a separate (lockowner, inode) lookup Address the possible performance regression mentioned in "nfsd4: hash lockowners to simplify RELEASE_LOCKOWNER" by providing a separate (lockowner, inode) hash. Really, I doubt this matters much, but I think it's likely we'll change these data structures here and I'd rather that the need for (owner, inode) lookups be well-documented. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5423732a |
|
19-Oct-2011 |
Benny Halevy <bhalevy@panasas.com> |
nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
996e0938 |
|
17-Oct-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: do idr preallocation with stateid allocation Move idr preallocation out of stateid initialization, into stateid allocation, so that we no longer have to handle any errors from the former. This is a little subtle due to the way the idr code manages these preallocated items--document that in comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d29b20cd |
|
13-Oct-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: clean up open owners on OPEN failure If process_open1() creates a new open owner, but the open later fails, the current code will leave the open owner around. It won't be on the close_lru list, and the client isn't expected to send a CLOSE, so it will hang around as long as the client does. Similarly, if process_open1() removes an existing open owner from the close lru, anticipating that an open owner that previously had no associated stateid's now will, but the open subsequently fails, then we'll again be left with the same leak. Fix both problems. Reported-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
3557e43b |
|
12-Oct-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: make is_open_owner boolean Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b31b30e5 |
|
28-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: cleanup state.h comments These comments are mostly out of date. Reported-by: Bryan Schumaker <bjschuma@netapp.com>
|
#
6409a5a6 |
|
28-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: clean up downgrading code In response to some review comments, get rid of the somewhat obscure for-loop with bitops, and improve a comment. Reported-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
38c2f4b1 |
|
23-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: look up stateid's per clientid Use a separate stateid idr per client, and lookup a stateid by first finding the client, then looking up the stateid relative to that client. Also some minor refactoring. This allows us to improve error returns: we can return expired when the clientid is not found and bad_stateid when the clientid is found but not the stateid, as opposed to returning expired for both cases. I hope this will also help to replace the state lock mostly by a per-client lock, but that hasn't been done yet. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
36279ac1 |
|
25-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: assume test_stateid always has session Test_stateid is 4.1-only and only allowed after a sequence operation, so this check is unnecessary. Cc: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
6136d2b4 |
|
23-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: use idr for stateid's The idr system is designed exactly for generating id and looking up integer id's. Thanks to Trond for pointing it out. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2a74aba7 |
|
23-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: move client * to nfs4_stateid, add init_stid helper This will be convenient. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f7a4d872 |
|
16-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: hash closed stateid's like any other Look up closed stateid's in the stateid hash like any other stateid rather than searching the close lru. This is simpler, and fixes a bug: currently we handle only the case of a close that is the last close for a given stateowner, but not the case of a close for a stateowner that still has active opens on other files. Thus in a case like: open(owner, file1) open(owner, file2) close(owner, file2) close(owner, file2) the final close won't be recognized as a retransmission. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d3b313a4 |
|
15-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: construct stateid from clientid and counter Including the full clientid in the on-the-wire stateid allows more reliable detection of bad vs. expired stateid's, simplifies code, and ensures we won't reuse the opaque part of the stateid (as we currently do when the same openowner closes and reopens the same file). Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
38c387b5 |
|
16-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: match close replays on stateid, not open owner id Keep around an unhashed copy of the final stateid after the last close using an openowner, and when identifying a replay, match against that stateid instead of just against the open owner id. Free it the next time the seqid is bumped or the stateowner is destroyed. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
dad1c067 |
|
11-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: replace oo_confirmed by flag bit I want at least one more bit here. So, let's haul out the caps lock key and add a flags field. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f459e453 |
|
09-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: hash deleg stateid's like any other It's simpler to look up delegation stateid's in the same hash table as any other stateid. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
d5477a8d |
|
07-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: add common dl_stid field to delegation Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
dcef0413 |
|
07-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: move some of nfs4_stateid into a separate structure We want delegations to share more with open/lock stateid's, so first we'll pull out some of the common stuff we want to share. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
2288d0e3 |
|
06-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: pass around typemask instead of flags We're only using those flags to choose lock or open stateid's at this point. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c0a5d93e |
|
06-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: split preprocess_seqid, cleanup Move most of this into helper functions. Also move the non-CONFIRM case into caller, providing a helper function for that purpose. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
fe0750e5 |
|
30-Jul-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: split stateowners into open and lockowners The stateowner has some fields that only make sense for openowners, and some that only make sense for lockowners, and I find it a lot clearer if those are separated out. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f4dee24c |
|
02-Sep-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: move CLOSE_STATE special case to caller Move the CLOSE_STATE case into the unique caller that cares about it rather than putting it in preprocess_seqid_op. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7c13f344 |
|
30-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: drop most stateowner refcounting Maybe we'll bring it back some day, but we don't have much real use for it now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
81b82965 |
|
23-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: simplify stateid generation code, fix wraparound Follow the recommendation from rfc3530bis for stateid generation number wraparound, simplify some code, and fix or remove incorrect comments. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5fa0bbb4 |
|
31-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: simplify distinguishing lock & open stateid's The trick free_stateid is using is a little cheesy, and we'll have more uses for this field later. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c2d8eb7a |
|
29-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: remove typoed replay field Wow, I wonder how long that typo's been there. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
28dde241 |
|
22-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: remove HAS_SESSION This flag doesn't really buy us anything. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
48483bf2 |
|
26-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: simplify recovery dir setting Move around some of this code, simplify a bit. Reviewed-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
57616300 |
|
10-Aug-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix seqid_mutating_error The set of errors here does *not* agree with the set of errors specified in the rfc! While we're there, turn this macros into a function, for the usual reasons, and move it to the one place where it's actually used. Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
17456804 |
|
13-Jul-2011 |
Bryan Schumaker <bjschuma@netapp.com> |
NFSD: Added TEST_STATEID operation This operation is used by the client to check the validity of a list of stateids. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
9ae78bcc |
|
16-Mar-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix comment and remove unused nfsd4_file fields A couple fields here were left over from a previous version of a patch, and are no longer used. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
acfdf5c3 |
|
31-Jan-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: acquire only one lease per file Instead of acquiring one lease each time another client opens a file, nfsd can acquire just one lease to represent all of them, and reference count it to determine when to release it. This fixes a regression introduced by c45821d263a8a5109d69a9e8942b8d65bcd5f31a "locks: eliminate fl_mylease callback": after that patch, only the struct file * is used to determine who owns a given lease. But since we recently converted the server to share a single struct file per open, if we acquire multiple leases on the same file from nfsd, it then becomes impossible on unlocking a lease to determine which of those leases (all of whom share the same struct file *) we meant to remove. Thanks to Takashi Iwai <tiwai@suse.de> for catching a bug in a previous version of this patch. Tested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5ce8ba25 |
|
10-Jan-2011 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: allow restarting callbacks If we lose the backchannel and then the client repairs the problem, resend any callbacks. We use a new cb_done flag to track whether there is still work to be done for the callback or whether it can be destroyed with the rpc. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
84f5f7cc |
|
09-Dec-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: make sure sequence flags are set after destroy_session If this loses any backchannel, make sure we have a chance to notice that and set the sequence flags. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
77a3569d |
|
30-Apr-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: keep finer-grained callback status Distinguish between when the callback channel is known to be down, and when it is not yet confirmed. This will be useful in the 4.1 case. Also, we don't seem to be using the fact that this field is atomic. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
1d1bc8f2 |
|
04-Oct-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: support BIND_CONN_TO_SESSION Basic xdr and processing for BIND_CONN_TO_SESSION. This adds a connection to the list of connections associated with a session. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
6f3d772f |
|
14-Dec-2010 |
Takuma Umeya <tumeya@redhat.com> |
nfs4: set source address when callback is generated when callback is generated in NFSv4 server, it doesn't set the source address. When an alias IP is utilized on NFSv4 server and suppose the client is accessing via that alias IP (e.g. eth0:0), the client invokes the callback to the IP address that is set on the original device (e.g. eth0). This behavior results in timeout of xprt. The patch sets the IP address that the client should invoke callback to. Signed-off-by: Takuma Umeya <tumeya@redhat.com> [bfields@redhat.com: Simplify gen_callback arguments, use helper function] Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
c84d500b |
|
30-Oct-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: use a single struct file for delegations When we converted to sharing struct filess between nfs4 opens I went too far and also used the same mechanism for delegations. But keeping a reference to the struct file ensures it will outlast the lease, and allows us to remove the lease with the same file as we added it. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8323c3b2 |
|
19-Oct-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: move minorversion to client The minorversion seems more a property of the client than the callback channel. Some time we should probably also enforce consistent minorversion usage from the client; for now, this is just a cosmetic change. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
5a3c9d71 |
|
19-Oct-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: separate callback change and callback probe Only one of the nfsd4_callback_probe callers actually cares about changing the callback information. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
8b5ce5cd |
|
19-Oct-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: callback program number is per-session The callback program is allowed to depend on the session which the callback is going over. No change in behavior yet, while we still only do callbacks over a single session for the lifetime of the client. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
ac7c46f2 |
|
14-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: make backchannel sequence number per-session Currently we don't deal well with a client that has multiple sessions associated with it (even simultaneously, or serially over the lifetime of the client). In particular, we don't attempt to keep the backchannel running after the original session diseappears. We will fix that soon. Once we do that, we need the slot sequence number to be per-session; otherwise, for example, we cannot correctly handle a case like this: - All session 1 connections are lost. - The client creates session 2. We use it for the backchannel (since it's the only working choice). - The client gives us a new connection to use with session 1. - The client destroys session 2. At this point our only choice is to go back to using session 1. When we do so we must use the sequence number that is next for session 1. We therefore need to maintain multiple sequence number streams. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
90c8145b |
|
14-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: use client pointer to backchannel session Instead of copying the sessionid, use the new cl_cb_session pointer, which indicates which session we're using for the backchannel. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
edd76786 |
|
14-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: move callback setup into session init code The backchannel should be associated with a session, it isn't really global to the client. We do, however, want a pointer global to the client which tracks which session we're currently using for client-based callbacks. This is a first step in that direction; for now, just reshuffling of code with no significant change in behavior. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
19cf5c02 |
|
06-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: use callbacks on svc_xprt_deletion Remove connections from the list when they go down. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
c7662518 |
|
06-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: keep per-session list of connections The spec requires us in various places to keep track of the connections associated with each session. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
6ff8da08 |
|
04-Jun-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: Move callback setup to callback queue Instead of creating the new rpc client from a regular server thread, set a flag, kick off a null call, and allow the null call to do the work of setting up the client on the callback workqueue. Use a spinlock to ensure the callback work gets a consistent view of the callback parameters. This allows, for example, changing the callback from contexts where sleeping is not allowed. I hope it will also keep the locking simple as we add more session and trunking features, by serializing most of the callback-specific work. This also closes a small race where the the new cb_ident could be used with an old connection (or vice-versa). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
fb003923 |
|
31-May-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: remove separate cb_args struct I don't see the point of the separate struct. It seems to just be getting in the way. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
cee277d9 |
|
26-May-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: use generic callback code in null case This will eventually allow us, for example, to kick off null callback from contexts where we can't sleep. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
5878453d |
|
16-May-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: generic callback code Make the recall callback code more generic, so that other callbacks will be able to use it too. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
1c855602 |
|
26-May-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: rename nfs4_rpc_args->nfsd4_cb_args With apologies for the gratuitous rename, the new name seems more helpful to me. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
586f3673 |
|
26-May-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence These two structs don't really need to be distinct as far as I can tell. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
7d947842 |
|
20-Aug-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: fix downgrade/lock logic If we already had a RW open for a file, and get a readonly open, we were piggybacking on the existing RW open. That's inconsistent with the downgrade logic which blows away the RW open assuming you'll still have a readonly open. Also, make sure there is a readonly or writeonly open available for locking, again to prevent bad behavior in downgrade cases when any RW open may be lost. Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
18608ad4 |
|
20-Aug-2010 |
J. Bruce Fields <bfields@redhat.com> |
nfsd4: typo fix in find_any_file Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f9d7562f |
|
08-Jul-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: share file descriptors between stateid's The vfs doesn't really allow us to "upgrade" a file descriptor from read-only to read-write, and our attempt to do so in nfs4_upgrade_open is ugly and incomplete. Move to a different scheme where we keep multiple opens, shared between open stateid's, in the nfs4_file struct. Each file will be opened at most 3 times (for read, write, and read-write), and those opens will be shared between all clients and openers. On upgrade we will do another open if necessary instead of attempting to upgrade an existing open. We keep count of the number of readers and writers so we know when to close the shared files. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
d7682988 |
|
11-May-2010 |
Benny Halevy <bhalevy@panasas.com> |
nfsd4: keep a reference count on client while in use Get a refcount on the client on SEQUENCE, Release the refcount and renew the client when all respective compounds completed. Do not expire the client by the laundromat while in use. If the client was expired via another path, free it when the compounds complete and the refcount reaches 0. Note that unhash_client_locked must call list_del_init on cl_lru as it may be called twice for the same client (once from nfs4_laundromat and then from expire_client) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
07cd4909 |
|
11-May-2010 |
Benny Halevy <bhalevy@panasas.com> |
nfsd4: mark_client_expired Mark the client as expired under the client_lock so it won't be renewed when an nfsv4.1 session is done, after it was explicitly expired during processing of the compound. Do not renew a client mark as expired (in particular, it is not on the lru list anymore) Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
46583e25 |
|
11-May-2010 |
Benny Halevy <bhalevy@panasas.com> |
nfsd4: introduce nfs4_client.cl_refcount Currently just initialize the cl_refcount to 1 and decrement in expire_client(), conditionally freeing the client when the refcount reaches 0. To be used later by nfsv4.1 compounds to keep the client from timing out while in use. Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
4b21d0de |
|
07-Mar-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: allow 4.0 clients to change callback path The rfc allows a client to change the callback parameters, but we didn't previously implement it. Teach the callbacks to rerun themselves (by placing themselves on a workqueue) when they recognize that their rpc task has been killed and that the callback connection has changed. Then we can change the callback connection by setting up a new rpc client, modifying the nfs4 client to point at it, waiting for any work in progress to complete, and then shutting down the old client. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
2bf23875 |
|
07-Mar-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: rearrange cb data structures Mainly I just want to separate the arguments used for setting up the tcp client from the rest. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
b12a05cb |
|
04-Mar-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: cl_count is unused Now that the shutdown sequence guarantees callbacks are shut down before the client is destroyed, we no longer have a use for cl_count. We'll probably reinstate a reference count on the client some day, but it will be held by users other than callbacks. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
b5a1a81e |
|
03-Mar-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: don't sleep in lease-break callback The NFSv4 server's fl_break callback can sleep (dropping the BKL), in order to allocate a new rpc task to send a recall to the client. As far as I can tell this doesn't cause any races in the current code, but the analysis is difficult. Also, the sleep here may complicate the move away from the BKL. So, just schedule some work to do the job for us instead. The work will later also prove useful for restarting a call after the callback information is changed. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
227f98d9 |
|
18-Feb-2010 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd4: preallocate nfs4_rpc_args Instead of allocating this small structure, just include it in the delegation. The nfsd4_callback structure isn't really necessary yet, but we plan to add to it all the information necessary to perform a callback. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
7663dacd |
|
04-Dec-2009 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd: remove pointless paths in file headers The new .h files have paths at the top that are now out of date. While we're here, just remove all of those from fs/nfsd; they never served any purpose. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
1557aca7 |
|
04-Dec-2009 |
J. Bruce Fields <bfields@citi.umich.edu> |
nfsd: move most of nfsfh.h to fs/nfsd Most of this can be trivially moved to a private header as well. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|
#
9a74af21 |
|
03-Dec-2009 |
Boaz Harrosh <bharrosh@panasas.com> |
nfsd: Move private headers to source directory Lots of include/linux/nfsd/* headers are only used by nfsd module. Move them to the source directory Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
|