History log of /linux-master/fs/lockd/svc.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# a2214ed5 26-Jan-2024 Josef Bacik <josef@toxicpanda.com>

nfsd: stop setting ->pg_stats for unused stats

A lot of places are setting a blank svc_stats in ->pg_stats and never
utilizing these stats. Remove all of these extra structs as we're not
reporting these stats anywhere.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 1e3577a4 14-Dec-2023 NeilBrown <neilb@suse.de>

SUNRPC: discard sv_refcnt, and svc_get/svc_put

sv_refcnt is no longer useful.
lockd and nfs-cb only ever have the svc active when there are a non-zero
number of threads, so sv_refcnt mirrors sv_nrthreads.

nfsd also keeps the svc active between when a socket is added and when
the first thread is started, but we don't really need a refcount for
that. We can simply not destroy the svc while there are any permanent
sockets attached.

So remove sv_refcnt and the get/put functions.
Instead of a final call to svc_put(), call svc_destroy() instead.
This is changed to also store NULL in the passed-in pointer to make it
easier to avoid use-after-free situations.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 9d5b9475 20-Nov-2023 Joel Granados <j.granados@samsung.com>

fs: Remove the now superfluous sentinel elements from ctl_table array

This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove sentinel elements ctl_table struct. Special attention was placed in
making sure that an empty directory for fs/verity was created when
CONFIG_FS_VERITY_BUILTIN_SIGNATURES is not defined. In this case we use the
register sysctl call that expects a size.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>


# fa341560 11-Sep-2023 NeilBrown <neilb@suse.de>

SUNRPC: change how svc threads are asked to exit.

svc threads are currently stopped using kthread_stop(). This requires
identifying a specific thread. However we don't care which thread
stops, just as long as one does.

So instead, set a flag in the svc_pool to say that a thread needs to
die, and have each thread check this flag instead of calling
kthread_should_stop(). The first thread to find and clear this flag
then moves towards exiting.

This removes an explicit dependency on sp_all_threads which will make a
future patch simpler.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# f4578ba1 10-Oct-2023 NeilBrown <neilb@suse.de>

lockd: hold a reference to nlmsvc_serv while stopping the thread.

Both nfsd and nfsv4-callback take a temporary reference to the svc_serv
while calling svc_set_num_threads() to stop the last thread. lockd does
not.

This extra reference prevents the scv_serv from being freed when the
last thread drops its reference count. This is not currently needed
for lockd as the svc_serv is not accessed after the last thread is told
to exit.

However a future patch will require svc_exit_thread() to access the
svc_serv after the svc_put() so it will need the code that calls
svc_set_num_threads() to keep a reference and keep the svc_serv active.

So copy the pattern from nfsd and nfsv4-cb to lockd, and take a
reference around svc_set_num_threads(.., 0)

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 78c542f9 29-Jul-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Add enum svc_auth_status

In addition to the benefits of using an enum rather than a set of
macros, we now have a named type that can improve static type
checking of function return values.

As part of this change, I removed a stale comment from svcauth.h;
the return values from current implementations of the
auth_ops::release method are all zero/negative errno, not the SVC_OK
enum values as the old comment suggested.

Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# c743b425 18-Jul-2023 NeilBrown <neilb@suse.de>

SUNRPC: remove timeout arg from svc_recv()

Most svc threads have no interest in a timeout.
nfsd sets it to 1 hour, but this is a wart of no significance.

lockd uses the timeout so that it can call nlmsvc_retry_blocked().
It also sometimes calls svc_wake_up() to ensure this is called.

So change lockd to be consistent and always use svc_wake_up() to trigger
nlmsvc_retry_blocked() - using a timer instead of a timeout to
svc_recv().

And change svc_recv() to not take a timeout arg.

This makes the sp_threads_timedout counter always zero.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 7b719e2b 18-Jul-2023 NeilBrown <neilb@suse.de>

SUNRPC: change svc_recv() to return void.

svc_recv() currently returns a 0 on success or one of two errors:
- -EAGAIN means no message was successfully received
- -EINTR means the thread has been told to stop

Previously nfsd would stop as the result of a signal as well as
following kthread_stop(). In that case the difference was useful: EINTR
means stop unconditionally. EAGAIN means stop if kthread_should_stop(),
continue otherwise.

Now threads only exit when kthread_should_stop() so we don't need the
distinction.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# f78116d3 18-Jul-2023 NeilBrown <neilb@suse.de>

SUNRPC: call svc_process() from svc_recv().

All callers of svc_recv() go on to call svc_process() on success.
Simplify callers by having svc_recv() do that for them.

This loses one call to validate_process_creds() in nfsd. That was
debugging code added 14 years ago. I don't think we need to keep it.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 8db14cad 18-Jul-2023 NeilBrown <neilb@suse.de>

lockd: remove SIGKILL handling

lockd allows SIGKILL and responds by dropping all locks and restarting
the grace period. This functionality has been present since 2.1.32 when
lockd was added to Linux.

This functionality is undocumented and most likely added as a useful
debug aid. When there is a need to drop locks, the better approach is
to use /proc/fs/nfsd/unlock_*.

This patch removes SIGKILL handling as part of preparation for removing
all signal handling from sunrpc service threads.

Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 665e89ab 02-Jun-2023 NeilBrown <neilb@suse.de>

lockd: drop inappropriate svc_get() from locked_get()

The below-mentioned patch was intended to simplify refcounting on the
svc_serv used by locked. The goal was to only ever have a single
reference from the single thread. To that end we dropped a call to
lockd_start_svc() (except when creating thread) which would take a
reference, and dropped the svc_put(serv) that would drop that reference.

Unfortunately we didn't also remove the svc_get() from
lockd_create_svc() in the case where the svc_serv already existed.
So after the patch:
- on the first call the svc_serv was allocated and the one reference
was given to the thread, so there are no extra references
- on subsequent calls svc_get() was called so there is now an extra
reference.
This is clearly not consistent.

The inconsistency is also clear in the current code in lockd_get()
takes *two* references, one on nlmsvc_serv and one by incrementing
nlmsvc_users. This clearly does not match lockd_put().

So: drop that svc_get() from lockd_get() (which used to be in
lockd_create_svc().

Reported-by: Ido Schimmel <idosch@idosch.org>
Closes: https://lore.kernel.org/linux-nfs/ZHsI%2FH16VX9kJQX1@shredder/T/#u
Fixes: b73a2972041b ("lockd: move lockd_start_svc() call into lockd_create_svc()")
Signed-off-by: NeilBrown <neilb@suse.de>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# fc412a61 02-May-2023 Tom Rix <trix@redhat.com>

lockd: define nlm_port_min,max with CONFIG_SYSCTL

gcc with W=1 and ! CONFIG_SYSCTL
fs/lockd/svc.c:80:51: error: ‘nlm_port_max’ defined but not used [-Werror=unused-const-variable=]
80 | static const int nlm_port_min = 0, nlm_port_max = 65535;
| ^~~~~~~~~~~~
fs/lockd/svc.c:80:33: error: ‘nlm_port_min’ defined but not used [-Werror=unused-const-variable=]
80 | static const int nlm_port_min = 0, nlm_port_max = 65535;
| ^~~~~~~~~~~~

The only use of these variables is when CONFIG_SYSCTL
is defined, so their definition should be likewise conditional.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# c1d889cf 10-Mar-2023 Luis Chamberlain <mcgrof@kernel.org>

lockd: simplify two-level sysctl registration for nlm_sysctls

There is no need to declare two tables to just create directories,
this can be easily be done with a prefix path with register_sysctl().

Simplify this registration.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 37b768ce 10-Mar-2023 Luis Chamberlain <mcgrof@kernel.org>

lockd: simplify two-level sysctl registration for nlm_sysctls

There is no need to declare two tables to just create directories,
this can be easily be done with a prefix path with register_sysctl().

Simplify this registration.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>


# f1aa2eb5 10-Feb-2023 Ondrej Mosnacek <omosnace@redhat.com>

sysctl: fix proc_dobool() usability

Currently proc_dobool expects a (bool *) in table->data, but sizeof(int)
in table->maxsize, because it uses do_proc_dointvec() directly.

This is unsafe for at least two reasons:
1. A sysctl table definition may use { .data = &variable, .maxsize =
sizeof(variable) }, not realizing that this makes the sysctl unusable
(see the Fixes: tag) and that they need to use the completely
counterintuitive sizeof(int) instead.
2. proc_dobool() will currently try to parse an array of values if given
.maxsize >= 2*sizeof(int), but will try to write values of type bool
by offsets of sizeof(int), so it will not work correctly with neither
an (int *) nor a (bool *). There is no .maxsize validation to prevent
this.

Fix this by:
1. Constraining proc_dobool() to allow only one value and .maxsize ==
sizeof(bool).
2. Wrapping the original struct ctl_table in a temporary one with .data
pointing to a local int variable and .maxsize set to sizeof(int) and
passing this one to proc_dointvec(), converting the value to/from
bool as needed (using proc_dou8vec_minmax() as an example).
3. Extending sysctl_check_table() to enforce proc_dobool() expectations.
4. Fixing the proc_dobool() docstring (it was just copy-pasted from
proc_douintvec, apparently...).
5. Converting all existing proc_dobool() users to set .maxsize to
sizeof(bool) instead of sizeof(int).

Fixes: 83efeeeb3d04 ("tty: Allow TIOCSTI to be disabled")
Fixes: a2071573d634 ("sysctl: introduce new proc handler proc_dobool")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>


# 65ba3d24 10-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Use per-CPU counters to tally server RPC counts

- Improves counting accuracy
- Reduces cross-CPU memory traffic

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# cee4db19 08-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Refactor RPC server dispatch method

Currently, svcauth_gss_accept() pre-reserves response buffer space
for the RPC payload length and GSS sequence number before returning
to the dispatcher, which then adds the header's accept_stat field.

The problem is the accept_stat field is supposed to go before the
length and seq_num fields. So svcauth_gss_release() has to relocate
the accept_stat value (see svcauth_gss_prepare_to_wrap()).

To enable these fields to be added to the response buffer in the
correct (final) order, the pointer to the accept_stat has to be made
available to svcauth_gss_accept() so that it can set it before
reserving space for the length and seq_num fields.

As a first step, move the pointer to the location of the accept_stat
field into struct svc_rqst.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 8dd41d70 08-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Push svcxdr_init_encode() into svc_process_common()

Now that all vs_dispatch functions invoke svcxdr_init_encode(), it
is common code and can be pushed down into the generic RPC server.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# dba5eaa4 01-Jan-2023 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Push svcxdr_init_decode() into svc_process_common()

Now that all vs_dispatch functions invoke svcxdr_init_decode(), it
is common code and can be pushed down into the generic RPC server.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 37902c63 15-Feb-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Move svc_serv_ops::svo_function into struct svc_serv

Hoist svo_function back into svc_serv and remove struct
svc_serv_ops, since the struct is now devoid of fields.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# f49169c9 15-Feb-2022 Chuck Lever <chuck.lever@oracle.com>

NFSD: Remove svc_serv_ops::svo_module

struct svc_serv_ops is about to be removed.

Neil Brown says:
> I suspect svo_module can go as well - I don't think the thread is
> ever the thing that primarily keeps a module active.

A random sample of kthread_create() callers shows sunrpc is the only
one that manages module reference count in this way.

Suggested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# c7d7ec8f 26-Jan-2022 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove svc_shutdown_net()

Clean up: svc_shutdown_net() now does nothing but call
svc_close_net(). Replace all external call sites.

svc_close_net() is renamed to be the inverse of svc_xprt_create().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 352ad314 26-Jan-2022 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Rename svc_create_xprt()

Clean up: Use the "svc_xprt_<task>" function naming convention as
is used for other external APIs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 87cdd864 25-Jan-2022 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove svo_shutdown method

Clean up. Neil observed that "any code that calls svc_shutdown_net()
knows what the shutdown function should be, and so can call it
directly."

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: NeilBrown <neilb@suse.de>


# a9ff2e99 25-Jan-2022 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove the .svo_enqueue_xprt method

We have never been able to track down and address the underlying
cause of the performance issues with workqueue-based service
support. svo_enqueue_xprt is called multiple times per RPC, so
it adds instruction path length, but always ends up at the same
function: svc_xprt_do_enqueue(). We do not anticipate needing
this flexibility for dynamic nfsd thread management support.

As a micro-optimization, remove .svo_enqueue_xprt because
Spectre/Meltdown makes virtual function calls more costly.

This change essentially reverts commit b9e13cdfac70 ("nfsd/sunrpc:
turn enqueueing a svc_xprt into a svc_serv operation").

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 35ce8ae9 16-Jan-2022 Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull signal/exit/ptrace updates from Eric Biederman:
"This set of changes deletes some dead code, makes a lot of cleanups
which hopefully make the code easier to follow, and fixes bugs found
along the way.

The end-game which I have not yet reached yet is for fatal signals
that generate coredumps to be short-circuit deliverable from
complete_signal, for force_siginfo_to_task not to require changing
userspace configured signal delivery state, and for the ptrace stops
to always happen in locations where we can guarantee on all
architectures that the all of the registers are saved and available on
the stack.

Removal of profile_task_ext, profile_munmap, and profile_handoff_task
are the big successes for dead code removal this round.

A bunch of small bug fixes are included, as most of the issues
reported were small enough that they would not affect bisection so I
simply added the fixes and did not fold the fixes into the changes
they were fixing.

There was a bug that broke coredumps piped to systemd-coredump. I
dropped the change that caused that bug and replaced it entirely with
something much more restrained. Unfortunately that required some
rebasing.

Some successes after this set of changes: There are few enough calls
to do_exit to audit in a reasonable amount of time. The lifetime of
struct kthread now matches the lifetime of struct task, and the
pointer to struct kthread is no longer stored in set_child_tid. The
flag SIGNAL_GROUP_COREDUMP is removed. The field group_exit_task is
removed. Issues where task->exit_code was examined with
signal->group_exit_code should been examined were fixed.

There are several loosely related changes included because I am
cleaning up and if I don't include them they will probably get lost.

The original postings of these changes can be found at:
https://lkml.kernel.org/r/87a6ha4zsd.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87bl1kunjj.fsf@email.froward.int.ebiederm.org
https://lkml.kernel.org/r/87r19opkx1.fsf_-_@email.froward.int.ebiederm.org

I trimmed back the last set of changes to only the obviously correct
once. Simply because there was less time for review than I had hoped"

* 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (44 commits)
ptrace/m68k: Stop open coding ptrace_report_syscall
ptrace: Remove unused regs argument from ptrace_report_syscall
ptrace: Remove second setting of PT_SEIZED in ptrace_attach
taskstats: Cleanup the use of task->exit_code
exit: Use the correct exit_code in /proc/<pid>/stat
exit: Fix the exit_code for wait_task_zombie
exit: Coredumps reach do_group_exit
exit: Remove profile_handoff_task
exit: Remove profile_task_exit & profile_munmap
signal: clean up kernel-doc comments
signal: Remove the helper signal_group_exit
signal: Rename group_exit_task group_exec_task
coredump: Stop setting signal->group_exit_task
signal: Remove SIGNAL_GROUP_COREDUMP
signal: During coredumps set SIGNAL_GROUP_EXIT in zap_process
signal: Make coredump handling explicit in complete_signal
signal: Have prepare_signal detect coredumps using signal->core_state
signal: Have the oom killer detect coredumps using signal->core_state
exit: Move force_uaccess back into do_exit
exit: Guarantee make_task_dead leaks the tsk when calling do_task_exit
...


# 6b044fba 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: use svc_set_num_threads() for thread start and stop

svc_set_num_threads() does everything that lockd_start_svc() does, except
set sv_maxconn. It also (when passed 0) finds the threads and
stops them with kthread_stop().

So move the setting for sv_maxconn, and use svc_set_num_thread()

We now don't need nlmsvc_task.

Now that we use svc_set_num_threads() it makes sense to set svo_module.
This request that the thread exists with module_put_and_exit().
Also fix the documentation for svo_module to make this explicit.

svc_prepare_thread is now only used where it is defined, so it can be
made static.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# ecd3ad68 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: rename lockd_create_svc() to lockd_get()

lockd_create_svc() already does an svc_get() if the service already
exists, so it is more like a "get" than a "create".

So:
- Move the increment of nlmsvc_users into the function as well
- rename to lockd_get().

It is now the inverse of lockd_put().

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 865b6740 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: introduce lockd_put()

There is some cleanup that is duplicated in lockd_down() and the failure
path of lockd_up().
Factor these out into a new lockd_put() and call it from both places.

lockd_put() does *not* take the mutex - that must be held by the caller.
It decrements nlmsvc_users and if that reaches zero, it cleans up.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 6a4e2527 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: move svc_exit_thread() into the thread

The normal place to call svc_exit_thread() is from the thread itself
just before it exists.
Do this for lockd.

This means that nlmsvc_rqst is not used out side of lockd_start_svc(),
so it can be made local to that function, and renamed to 'rqst'.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# b73a2972 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: move lockd_start_svc() call into lockd_create_svc()

lockd_start_svc() only needs to be called once, just after the svc is
created. If the start fails, the svc is discarded too.

It thus makes sense to call lockd_start_svc() from lockd_create_svc().
This allows us to remove the test against nlmsvc_rqst at the start of
lockd_start_svc() - it must always be NULL.

lockd_up() only held an extra reference on the svc until a thread was
created - then it dropped it. The thread - and thus the extra reference
- will remain until kthread_stop() is called.
Now that the thread is created in lockd_create_svc(), the extra
reference can be dropped there. So the 'serv' variable is no longer
needed in lockd_up().

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 5a8a7ff5 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: simplify management of network status notifiers

Now that the network status notifiers use nlmsvc_serv rather then
nlmsvc_rqst the management can be simplified.

Notifier unregistration synchronises with any pending notifications so
providing we unregister before nlm_serv is freed no further interlock
is required.

So we move the unregister call to just before the thread is killed
(which destroys the service) and just before the service is destroyed in
the failure-path of lockd_up().

Then nlm_ntf_refcnt and nlm_ntf_wq can be removed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 2840fe86 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: introduce nlmsvc_serv

lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it
wants the 'struct svc_serv', and when it doesn't want it exactly it can
get to what it wants from the serv.

This patch is a first step to removing nlmsvc_task and nlmsvc_rqst. It
introduces nlmsvc_serv to store the 'struct svc_serv*'. This is set as
soon as the serv is created, and cleared only when it is destroyed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# ec52361d 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC: stop using ->sv_nrthreads as a refcount

The use of sv_nrthreads as a general refcount results in clumsy code, as
is seen by various comments needed to explain the situation.

This patch introduces a 'struct kref' and uses that for reference
counting, leaving sv_nrthreads to be a pure count of threads. The kref
is managed particularly in svc_get() and svc_put(), and also nfsd_put();

svc_destroy() now takes a pointer to the embedded kref, rather than to
the serv.

nfsd allows the svc_serv to exist with ->sv_nrhtreads being zero. This
happens when a transport is created before the first thread is started.
To support this, a 'keep_active' flag is introduced which holds a ref on
the svc_serv. This is set when any listening socket is successfully
added (unless there are running threads), and cleared when the number of
threads is set. So when the last thread exits, the nfs_serv will be
destroyed.
The use of 'keep_active' replaces previous code which checked if there
were any permanent sockets.

We no longer clear ->rq_server when nfsd() exits. This was done
to prevent svc_exit_thread() from calling svc_destroy().
Instead we take an extra reference to the svc_serv to prevent
svc_destroy() from being called.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 8c62d127 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC/NFSD: clean up get/put functions.

svc_destroy() is poorly named - it doesn't necessarily destroy the svc,
it might just reduce the ref count.
nfsd_destroy() is poorly named for the same reason.

This patch:
- removes the refcount functionality from svc_destroy(), moving it to
a new svc_put(). Almost all previous callers of svc_destroy() now
call svc_put().
- renames nfsd_destroy() to nfsd_put() and improves the code, using
the new svc_destroy() rather than svc_put()
- removes a few comments that explain the important for balanced
get/put calls. This should be obvious.

The only non-trivial part of this is that svc_destroy() would call
svc_sock_update() on a non-final decrement. It can no longer do that,
and svc_put() isn't really a good place of it. This call is now made
from svc_exit_thread() which seems like a good place. This makes the
call *before* sv_nrthreads is decremented rather than after. This
is not particularly important as the call just sets a flag which
causes sv_nrthreads set be checked later. A subsequent patch will
improve the ordering.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# df5e49c8 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC: change svc_get() to return the svc.

It is common for 'get' functions to return the object that was 'got',
and there are a couple of places where users of svc_get() would be a
little simpler if svc_get() did that.

Make it so.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# fda49441 13-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Replace the "__be32 *p" parameter to .pc_encode

The passed-in value of the "__be32 *p" parameter is now unused in
every server-side XDR encoder, and can be removed.

Note also that there is a line in each encoder that sets up a local
pointer to a struct xdr_stream. Passing that pointer from the
dispatcher instead saves one line per encoder function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 16c66364 12-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Replace the "__be32 *p" parameter to .pc_decode

The passed-in value of the "__be32 *p" parameter is now unused in
every server-side XDR decoder, and can be removed.

Note also that there is a line in each decoder that sets up a local
pointer to a struct xdr_stream. Passing that pointer from the
dispatcher instead saves one line per decoder function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 0961f0c0 04-Sep-2021 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client updates from Anna Schumaker:
"New Features:
- Better client responsiveness when server isn't replying
- Use refcount_t in sunrpc rpc_client refcount tracking
- Add srcaddr and dst_port to the sunrpc sysfs info files
- Add basic support for connection sharing between servers with multiple NICs`

Bugfixes and Cleanups:
- Sunrpc tracepoint cleanups
- Disconnect after ib_post_send() errors to avoid deadlocks
- Fix for tearing down rpcrdma_reps
- Fix a potential pNFS layoutget livelock loop
- pNFS layout barrier fixes
- Fix a potential memory corruption in rpc_wake_up_queued_task_set_status()
- Fix reconnection locking
- Fix return value of get_srcport()
- Remove rpcrdma_post_sends()
- Remove pNFS dead code
- Remove copy size restriction for inter-server copies
- Overhaul the NFS callback service
- Clean up sunrpc TCP socket shutdowns
- Always provide aligned buffers to RPC read layers"

* tag 'nfs-for-5.15-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (39 commits)
NFS: Always provide aligned buffers to the RPC read layers
NFSv4.1 add network transport when session trunking is detected
SUNRPC enforce creation of no more than max_connect xprts
NFSv4 introduce max_connect mount options
SUNRPC add xps_nunique_destaddr_xprts to xprt_switch_info in sysfs
SUNRPC keep track of number of transports to unique addresses
NFSv3: Delete duplicate judgement in nfs3_async_handle_jukebox
SUNRPC: Tweak TCP socket shutdown in the RPC client
SUNRPC: Simplify socket shutdown when not reusing TCP ports
NFSv4.2: remove restriction of copy size for inter-server copy.
NFS: Clean up the synopsis of callback process_op()
NFS: Extract the xdr_init_encode/decode() calls from decode_compound
NFS: Remove unused callback void decoder
NFS: Add a private local dispatcher for NFSv4 callback operations
SUNRPC: Eliminate the RQ_AUTHERR flag
SUNRPC: Set rq_auth_stat in the pg_authenticate() callout
SUNRPC: Add svc_rqst::rq_auth_stat
SUNRPC: Add dst_port to the sysfs xprt info file
SUNRPC: Add srcaddr as a file in sysfs
sunrpc: Fix return value of get_srcport()
...


# d02a3a2c 02-Aug-2021 Jia He <hejianet@gmail.com>

lockd: change the proc_handler for nsm_use_hostnames

nsm_use_hostnames is a module parameter and it will be exported to sysctl
procfs. This is to let user sometimes change it from userspace. But the
minimal unit for sysctl procfs read/write it sizeof(int).
In big endian system, the converting from/to bool to/from int will cause
error for proc items.

This patch use a new proc_handler proc_dobool to fix it.

Signed-off-by: Jia He <hejianet@gmail.com>
Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
[thuth: Fix typo in commit message]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

# 5c2465df 15-Jul-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Set rq_auth_stat in the pg_authenticate() callout

In a few moments, rq_auth_stat will need to be explicitly set to
rpc_auth_ok before execution gets to the dispatcher.

svc_authenticate() already sets it, but it often gets reset to
rpc_autherr_badcred right after that call, even when authentication
is successful. Let's ensure that the pg_authenticate callout and
svc_set_client() set it properly in every case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

# a9ad1a80 03-Jun-2021 Chuck Lever <chuck.lever@oracle.com>

lockd: Create a simplified .vs_dispatch method for NLM requests

To enable xdr_stream-based encoding and decoding, create a bespoke
RPC dispatch function for the lockd service.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 09c434b8 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Add SPDX license identifier for more missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have MODULE_LICENCE("GPL*") inside which was used in the initial
scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

# 40373b12 08-Apr-2019 Trond Myklebust <trondmy@gmail.com>

lockd: Pass the user cred from knfsd when starting the lockd server

When starting up a new knfsd server, pass the user cred to the
supporting lockd server.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 4df493a2 08-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Cache the process user cred in the RPC server listener

In order to be able to interpret uids and gids correctly in knfsd, we
should cache the user namespace of the process that created the RPC
server's listener. To do so, we refcount the credential of that process.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 642ee6b2 09-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Allow further customisation of RPC program registration

Add a callback to allow customisation of the rpcbind registration.
When clients have the ability to turn on and off version support,
we want to allow them to also prevent registration of those
versions with the rpc portmapper.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 8e5b6773 09-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Add a callback to initialise server requests

Add a callback to help initialise server requests before they are
processed. This will allow us to clean up the NFS server version
support, and to make it container safe.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 554faf28 07-Feb-2018 Colin Ian King <colin.king@intel.com>

lockd: make nlm_ntf_refcnt and nlm_ntf_wq static

The variables nlm_ntf_refcnt and nlm_ntf_wq are local to the source and
do not need to be in global scope, so make them static.

Cleans up sparse warnings:
fs/lockd/svc.c:60:10: warning: symbol 'nlm_ntf_refcnt' was not declared.
Should it be static?
fs/lockd/svc.c:61:1: warning: symbol 'nlm_ntf_wq' was not declared.
Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 6b18dd1c 10-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

race of lockd inetaddr notifiers vs nlmsvc_rqst change

lockd_inet[6]addr_event use nlmsvc_rqst without taken nlmsvc_mutex,
nlmsvc_rqst can be changed during execution of notifiers and crash the host.

Patch enables access to nlmsvc_rqst only when it was correctly initialized
and delays its cleanup until notifiers are no longer in use.

Note that nlmsvc_rqst can be temporally set to ERR_PTR, so the "if
(nlmsvc_rqst)" check in notifiers is insufficient on its own.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 3a2b19d1 02-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: lost rollback of set_grace_period() in lockd_down_net()

Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect,
it removes lockd_manager and disarm grace_period_end for init_net only.

If nfsd was started from another net namespace lockd_up_net() calls
set_grace_period() that adds lockd_manager into per-netns list
and queues grace_period_end delayed work.

These action should be reverted in lockd_down_net().
Otherwise it can lead to double list_add on after restart nfsd in netns,
and to use-after-free if non-disarmed delayed work will be executed after netns destroy.

Fixes: efda760fe95e ("lockd: fix lockd shutdown race")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# a3152f14 06-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: added cleanup checks in exit_net hook

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# e919b076 07-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: remove net pointer from messages

Publishing of net pointer is not safe,
use net->ns.inum as net ID in debug messages

[ 171.757678] lockd_up_net: per-net data created; net=f00001e7
[ 171.767188] NFSD: starting 90-second grace period (net f00001e7)
[ 300.653313] lockd: nuking all hosts in net f00001e7...
[ 300.653641] lockd: host garbage collection for net f00001e7
[ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7
[ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7
[ 300.711847] lockd: nuking all hosts in net 0...
[ 300.711847] lockd: host garbage collection for net 0
[ 300.711848] lockd: nlmsvc_mark_resources for net 0

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 4dd3c2e5 18-Nov-2017 Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"Lots of good bugfixes, including:

- fix a number of races in the NFSv4+ state code

- fix some shutdown crashes in multiple-network-namespace cases

- relax our 4.1 session limits; if you've an artificially low limit
to the number of 4.1 clients that can mount simultaneously, try
upgrading"

* tag 'nfsd-4.15' of git://linux-nfs.org/~bfields/linux: (22 commits)
SUNRPC: Improve ordering of transport processing
nfsd: deal with revoked delegations appropriately
svcrdma: Enqueue after setting XPT_CLOSE in completion handlers
nfsd: use nfs->ns.inum as net ID
rpc: remove some BUG()s
svcrdma: Preserve CB send buffer across retransmits
nfds: avoid gettimeofday for nfssvc_boot time
fs, nfsd: convert nfs4_file.fi_ref from atomic_t to refcount_t
fs, nfsd: convert nfs4_cntl_odstate.co_odcount from atomic_t to refcount_t
fs, nfsd: convert nfs4_stid.sc_count from atomic_t to refcount_t
lockd: double unregister of inetaddr notifiers
nfsd4: catch some false session retries
nfsd4: fix cached replies to solo SEQUENCE compounds
sunrcp: make function _svc_create_xprt static
SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status
nfsd: use ARRAY_SIZE
nfsd: give out fewer session slots as limit approaches
nfsd: increase DRC cache limit
nfsd: remove unnecessary nofilehandle checks
nfs_common: convert int to bool
...


# dc3033e1 20-Oct-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: double unregister of inetaddr notifiers

lockd_up() can call lockd_unregister_notifiers twice:
inside lockd_start_svc() when it calls lockd_svc_exit_thread()
and then in error path of lockd_up()

Patch forces lockd_start_svc() to unregister notifiers in all error cases
and removes extra unregister in error path of lockd_up().

Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..."
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# e4dca7b7 17-Oct-2017 Kees Cook <keescook@chromium.org>

treewide: Fix function prototypes for module_param_call()

Several function prototypes for the set/get functions defined by
module_param_call() have a slightly wrong argument types. This fixes
those in an effort to clean up the calls when running under type-enforced
compiler instrumentation for CFI. This is the result of running the
following semantic patch:

@match_module_param_call_function@
declarer name module_param_call;
identifier _name, _set_func, _get_func;
expression _arg, _mode;
@@

module_param_call(_name, _set_func, _get_func, _arg, _mode);

@fix_set_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._set_func;
identifier _val, _param;
type _val_type, _param_type;
@@

int _set_func(
-_val_type _val
+const char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }

@fix_get_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._get_func;
identifier _val, _param;
type _val_type, _param_type;
@@

int _get_func(
-_val_type _val
+char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }

Two additional by-hand changes are included for places where the above
Coccinelle script didn't notice them:

drivers/platform/x86/thinkpad_acpi.c
fs/lockd/svc.c

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>

# afea5657 31-Jul-2017 Chuck Lever <chuck.lever@oracle.com>

sunrpc: Const-ify struct sv_serv_ops

Close an attack vector by moving the arrays of per-server methods to
read-only memory.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# e9679189 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_version instances as const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>

# 7fd38af9 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: move pc_count out of struct svc_procinfo

pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>

# efda760f 28-Mar-2017 J. Bruce Fields <bfields@redhat.com>

lockd: fix lockd shutdown race

As reported by David Jeffery: "a signal was sent to lockd while lockd
was shutting down from a request to stop nfs. The signal causes lockd
to call restart_grace() which puts the lockd_net structure on the grace
list. If this signal is received at the wrong time, it will occur after
lockd_down_net() has called locks_end_grace() but before
lockd_down_net() stops the lockd thread. This leads to lockd putting
the lockd_net structure back on the grace list, then exiting without
anything removing it from the list."

So, perform the final locks_end_grace() from the the lockd thread; this
ensures it's serialized with respect to restart_grace().

Reported-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 3f07c014 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

# c01410f7 05-Jan-2017 Scott Mayhew <smayhew@redhat.com>

lockd: initialize sin6_scope_id in lockd_inet6addr_event()

I noticed this was missing when I was testing with link local addresses.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# c7d03a00 16-Nov-2016 Alexey Dobriyan <adobriyan@gmail.com>

netns: make struct pernet_operations::id unsigned int

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

void f(long *p, int i)
{
g(p[i]);
}

roughly translates to

movsx rsi, esi
mov rdi, [rsi+...]
call g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

# cb7d224f 30-Jun-2016 Scott Mayhew <smayhew@redhat.com>

lockd: unregister notifier blocks if the service fails to come up completely

If the lockd service fails to start up then we need to be sure that the
notifier blocks are not registered, otherwise a subsequent start of the
service could cause the same notifier to be registered twice, leading to
soft lockups.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 0751ddf77b6a "lockd: Register callbacks on the inetaddr_chain..."
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 2a297450 23-Dec-2015 Julia Lawall <Julia.Lawall@lip6.fr>

lockd: constify nlmsvc_binding structure

The nlmsvc_binding structure is never modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# ea44463f 01-Jan-2016 Geliang Tang <geliangtang@163.com>

lockd: use to_delayed_work

Use to_delayed_work() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 0751ddf7 11-Dec-2015 Scott Mayhew <smayhew@redhat.com>

lockd: Register callbacks on the inetaddr_chain and inet6addr_chain

Register callbacks on inetaddr_chain and inet6addr_chain to trigger
cleanup of lockd transport sockets when an ip address is deleted.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 0d0f4aab 07-Oct-2015 Andrey Ryabinin <ryabinin.a.a@gmail.com>

lockd: get rid of reference-counted NSM RPC clients

Currently we have reference-counted per-net NSM RPC client
which created on the first monitor request and destroyed
after the last unmonitor request. It's needed because
RPC client need to know 'utsname()->nodename', but utsname()
might be NULL when nsm_unmonitor() called.

So instead of holding the rpc client we could just save nodename
in struct nlm_host and pass it to the rpc_create().
Thus ther is no need in keeping rpc client until last
unmonitor request. We could create separate RPC clients
for each monitor/unmonitor requests.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 0ad95472 23-Sep-2015 Andrey Ryabinin <ryabinin.a.a@gmail.com>

lockd: create NSM handles per net namespace

Commit cb7323fffa85 ("lockd: create and use per-net NSM
RPC clients on MON/UNMON requests") introduced per-net
NSM RPC clients. Unfortunately this doesn't make any sense
without per-net nsm_handle.

E.g. the following scenario could happen
Two hosts (X and Y) in different namespaces (A and B) share
the same nsm struct.

1. nsm_monitor(host_X) called => NSM rpc client created,
nsm->sm_monitored bit set.
2. nsm_mointor(host-Y) called => nsm->sm_monitored already set,
we just exit. Thus in namespace B ln->nsm_clnt == NULL.
3. host X destroyed => nsm->sm_count decremented to 1
4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
dereference of *ln->nsm_clnt

So this could be fixed by making per-net nsm_handles list,
instead of global. Thus different net namespaces will not be able
share the same nsm_handle.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# c87fb4a3 05-Aug-2015 J. Bruce Fields <bfields@redhat.com>

lockd: NLM grace period shouldn't block NFSv4 opens

NLM locks don't conflict with NFSv4 share reservations, so we're not
going to learn anything new by watiting for them.

They do conflict with NFSv4 locks and with delegations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# b9e13cdf 08-Jun-2015 Jeff Layton <jlayton@kernel.org>

nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation

For now, all services use svc_xprt_do_enqueue, but once we add
workqueue-based service support, we'll need to do something different.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# ea126e74 08-Jun-2015 Jeff Layton <jlayton@kernel.org>

nfsd/sunrpc: add a new svc_serv_ops struct and move sv_shutdown into it

In later patches we'll need to abstract out more operations on a
per-service level, besides sv_shutdown and sv_function.

Declare a new svc_serv_ops struct to hold these operations, and move
sv_shutdown into this struct.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 06bed7d1 02-Jan-2015 Trond Myklebust <trond.myklebust@primarydata.com>

LOCKD: Fix a race when initialising nlmsvc_timeout

This commit fixes a race whereby nlmclnt_init() first starts the lockd
daemon, and then calls nlm_bind_host() with the expectation that
nlmsvc_timeout has already been initialised. Unfortunately, there is no
no synchronisation between lockd() and lockd_up() to guarantee that this
is the case.

Fix is to move the initialisation of nlmsvc_timeout into lockd_create_svc

Fixes: 9a1b6bf818e74 ("LOCKD: Don't call utsname()->nodename...")
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org # 3.10.x
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>

# 0b5707e4 19-Nov-2014 Jeff Layton <jlayton@kernel.org>

sunrpc: require svc_create callers to pass in meaningful shutdown routine

Currently all svc_create callers pass in NULL for the shutdown parm,
which then gets fixed up to be svc_rpcb_cleanup if the service uses
rpcbind.

Simplify this by instead having the the only caller that requires it
(lockd) pass in svc_rpcb_cleanup and get rid of the special casing.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 6dea0737 07-Oct-2014 Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-3.18' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"Highlights:

- support the NFSv4.2 SEEK operation (allowing clients to support
SEEK_HOLE/SEEK_DATA), thanks to Anna.
- end the grace period early in a number of cases, mitigating a
long-standing annoyance, thanks to Jeff
- improve SMP scalability, thanks to Trond"

* 'for-3.18' of git://linux-nfs.org/~bfields/linux: (55 commits)
nfsd: eliminate "to_delegation" define
NFSD: Implement SEEK
NFSD: Add generic v4.2 infrastructure
svcrdma: advertise the correct max payload
nfsd: introduce nfsd4_callback_ops
nfsd: split nfsd4_callback initialization and use
nfsd: introduce a generic nfsd4_cb
nfsd: remove nfsd4_callback.cb_op
nfsd: do not clear rpc_resp in nfsd4_cb_done_sequence
nfsd: fix nfsd4_cb_recall_done error handling
nfsd4: clarify how grace period ends
nfsd4: stop grace_time update at end of grace period
nfsd: skip subsequent UMH "create" operations after the first one for v4.0 clients
nfsd: set and test NFSD4_CLIENT_STABLE bit to reduce nfsdcltrack upcalls
nfsd: serialize nfsdcltrack upcalls for a particular client
nfsd: pass extra info in env vars to upcalls to allow for early grace period end
nfsd: add a v4_end_grace file to /proc/fs/nfsd
lockd: add a /proc/fs/lockd/nlm_end_grace file
nfsd: reject reclaim request when client has already sent RECLAIM_COMPLETE
nfsd: remove redundant boot_time parm from grace_done client tracking op
...


# d68e3c4a 12-Sep-2014 Jeff Layton <jlayton@kernel.org>

lockd: add a /proc/fs/lockd/nlm_end_grace file

Add a new procfile that will allow a (privileged) userland process to
end the NLM grace period early. The basic idea here will be to have
sm-notify write to this file, if it sent out no NOTIFY requests when
it runs. In that situation, we can generally expect that there will be
no reclaim requests so the grace period can be lifted early.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>

# f7790029 12-Sep-2014 Jeff Layton <jlayton@kernel.org>

lockd: move lockd's grace period handling into its own module

Currently, all of the grace period handling is part of lockd. Eventually
though we'd like to be able to build v4-only servers, at which point
we'll need to put all of this elsewhere.

Move the code itself into fs/nfs_common and have it build a grace.ko
module. Then, rejigger the Kconfig options so that both nfsd and lockd
enable it automatically.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>

# 7c17705e 29-Aug-2014 J. Bruce Fields <bfields@redhat.com>

lockd: fix rpcbind crash on lockd startup failure

Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
then nfs mounting without portmap or rpcbind running using a busybox
mount resulted in:

# mount -t nfs 10.30.130.21:/opt /mnt
svc: failed to register lockdv1 RPC service (errno 111).
lockd_up: makesock failed, error=-111
Unable to handle kernel paging request for data at address 0x00000030
Faulting instruction address: 0xc055e65c
Oops: Kernel access of bad area, sig: 11 [#1]
MPC85xx CDS
Modules linked in:
CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
task: cf29cea0 ti: cf35c000 task.ti: cf35c000
NIP: c055e65c LR: c0566490 CTR: c055e648
REGS: cf35dad0 TRAP: 0300 Not tainted (3.10.44.cge)
MSR: 00029000 <CE,EE,ME> CR: 22442488 XER: 20000000
DEAR: 00000030, ESR: 00000000

GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
00000000
GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
10090ae8
GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
bfa46ef0
GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
cf0ded80
NIP [c055e65c] call_start+0x14/0x34
LR [c0566490] __rpc_execute+0x70/0x250
Call Trace:
[cf35db80] [00000080] 0x80 (unreliable)
[cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
[cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
[cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
[cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
[cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
[cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
[cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
[cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
[cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
[cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
[cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
[cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
[cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
[cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
[cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
[cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
[cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
[cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

The addition of svc_shutdown_net() resulted in two calls to
svc_rpcb_cleanup(); the second is no longer necessary and crashes when
it calls rpcb_register_call with clnt=NULL.

Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Fixes: 679b033df484 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
Cc: stable@vger.kernel.org
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# d4e89902 02-Sep-2014 Trond Myklebust <trond.myklebust@primarydata.com>

lockd: Do not start the lockd thread before we've set nlmsvc_rqst->rq_task

This fixes an Oopsable race when starting lockd.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# d6a7ce42 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com>

lockd: Ensure that lockd_start_svc sets the server rq_task...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 5b174fd6 10-Jun-2014 Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd updates from Bruce Fields:
"The largest piece is a long-overdue rewrite of the xdr code to remove
some annoying limitations: for example, there was no way to return
ACLs larger than 4K, and readdir results were returned only in 4k
chunks, limiting performance on large directories.

Also:
- part of Neil Brown's work to make NFS work reliably over the
loopback interface (so client and server can run on the same
machine without deadlocks). The rest of it is coming through
other trees.
- cleanup and bugfixes for some of the server RDMA code, from
Steve Wise.
- Various cleanup of NFSv4 state code in preparation for an
overhaul of the locking, from Jeff, Trond, and Benny.
- smaller bugfixes and cleanup from Christoph Hellwig and
Kinglong Mee.

Thanks to everyone!

This summer looks likely to be busier than usual for knfsd. Hopefully
we won't break it too badly; testing definitely welcomed"

* 'for-3.16' of git://linux-nfs.org/~bfields/linux: (100 commits)
nfsd4: fix FREE_STATEID lockowner leak
svcrdma: Fence LOCAL_INV work requests
svcrdma: refactor marshalling logic
nfsd: don't halt scanning the DRC LRU list when there's an RC_INPROG entry
nfs4: remove unused CHANGE_SECURITY_LABEL
nfsd4: kill READ64
nfsd4: kill READ32
nfsd4: simplify server xdr->next_page use
nfsd4: hash deleg stateid only on successful nfs4_set_delegation
nfsd4: rename recall_lock to state_lock
nfsd: remove unneeded zeroing of fields in nfsd4_proc_compound
nfsd: fix setting of NFS4_OO_CONFIRMED in nfsd4_open
nfsd4: use recall_lock for delegation hashing
nfsd: fix laundromat next-run-time calculation
nfsd: make nfsd4_encode_fattr static
SUNRPC/NFSD: Remove using of dprintk with KERN_WARNING
nfsd: remove unused function nfsd_read_file
nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer
NFSD: Error out when getting more than one fsloc/secinfo/uuid
NFSD: Using type of uint32_t for ex_nflavors instead of int
...


# 7ac9fe57 06-Jun-2014 Joe Perches <joe@perches.com>

lockd: convert use of typedef ctl_table to struct ctl_table

This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# 12dd7ecf 01-May-2014 Kees Cook <keescook@chromium.org>

lockd: avoid warning when CONFIG_SYSCTL undefined

When building without CONFIG_SYSCTL, the compiler saw an unused
label. This moves the label into the #ifdef it is used under.

fs/lockd/svc.c: In function ‘init_nlm’:
fs/lockd/svc.c:626:1: warning: label ‘err_sysctl’ defined but not used [-Wunused-label]

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 679b033d 25-Mar-2014 Jeff Layton <jlayton@kernel.org>

lockd: ensure we tear down any live sockets when socket creation fails during lockd_up

We had a Fedora ABRT report with a stack trace like this:

kernel BUG at net/sunrpc/svc.c:550!
invalid opcode: 0000 [#1] SMP
[...]
CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206
RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
Stack:
ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
Call Trace:
[<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
[<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
[<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
[<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
[<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
[<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
[<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
[<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
[<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
[<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
[<ffffffff811c3f99>] ? putname+0x29/0x40
[<ffffffff811b9569>] SyS_write+0x49/0xa0
[<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
[<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP <ffff88003f9b9de0>

Evidently, we created some lockd sockets and then failed to create
others. make_socks then returned an error and we tried to tear down the
svc, but svc->sv_permsocks was not empty so we ended up tripping over
the BUG() in svc_destroy().

Fix this by ensuring that we tear down any live sockets we created when
socket creation is going to return an error.

Fixes: 786185b5f8abefa (SUNRPC: move per-net operations from...)
Reported-by: Raphos <raphoszap@laposte.net>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# f170168b 03-Jul-2013 Kees Cook <keescook@chromium.org>

drivers: avoid parsing names as kthread_run() format strings

Calling kthread_run with a single name parameter causes it to be handled
as a format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# bd81ccea 12-Oct-2012 Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-3.7' of git://linux-nfs.org/~bfields/linux

Pull nfsd update from J Bruce Fields:
"Another relatively quiet cycle. There was some progress on my
remaining 4.1 todo's, but a couple of them were just of the form
"check that we do X correctly", so didn't have much affect on the
code.

Other than that, a bunch of cleanup and some bugfixes (including an
annoying NFSv4.0 state leak and a busy-loop in the server that could
cause it to peg the CPU without making progress)."

* 'for-3.7' of git://linux-nfs.org/~bfields/linux: (46 commits)
UAPI: (Scripted) Disintegrate include/linux/sunrpc
UAPI: (Scripted) Disintegrate include/linux/nfsd
nfsd4: don't allow reclaims of expired clients
nfsd4: remove redundant callback probe
nfsd4: expire old client earlier
nfsd4: separate session allocation and initialization
nfsd4: clean up session allocation
nfsd4: minor free_session cleanup
nfsd4: new_conn_from_crses should only allocate
nfsd4: separate connection allocation and initialization
nfsd4: reject bad forechannel attrs earlier
nfsd4: enforce per-client sessions/no-sessions distinction
nfsd4: set cl_minorversion at create time
nfsd4: don't pin clientids to pseudoflavors
nfsd4: fix bind_conn_to_session xdr comment
nfsd4: cast readlink() bug argument
NFSD: pass null terminated buf to kstrtouint()
nfsd: remove duplicate init in nfsd4_cb_recall
nfsd4: eliminate redundant nfs4_free_stateid
fs/nfsd/nfs4idmap.c: adjust inconsistent IS_ERR and PTR_ERR
...


# e9406db2 18-Sep-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

lockd: per-net NSM client creation and destruction helpers introduced

NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and
destroying it's mount namespace). It means, that current nsproxy is set to
NULL already, but creation of RPC client requires UTS namespace for gaining
hostname string.
This patch introduces reference counted NFS RPC clients creation and
destruction helpers (similar to RPCBIND RPC clients).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 5b444cc9 17-Aug-2012 J. Bruce Fields <bfields@redhat.com>

svcrpc: remove handling of unknown errors from svc_recv

svc_recv() returns only -EINTR or -EAGAIN. If we really want to worry
about the case where it has a bug that causes it to return something
else, we could stick a WARN() in svc_recv. But it's silly to require
every caller to have all this boilerplate to handle that case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 5630f7fa 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: move grace period management from lockd() to per-net functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 5ccb0066 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: pass actual network namespace to grace period management functions

Passed network namespace replaced hard-coded init_net

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# db9c4553 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: manage grace list per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 08d44a35 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: make lockd manager allocated per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 66547b02 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: manage grace period per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 8dbf28e4 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: add debug message to start and stop functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 3d1221df 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: service start function introduced

This is just a code move, which from my POV makes the code look better.
I.e. now on start we have 3 different stages:
1) Service creation.
2) Service per-net data allocation.
3) Service start.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 7d13ec76 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: move global usage counter manipulation from error path

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 24452239 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: service creation function introduced

This function creates service if it doesn't exist, or increases usage
counter if it does, and returns a pointer to it. The usage counter will
be droppepd by svc_destroy() later in lockd_up().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# dbf9b5d7 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: use existing per-net data function on service creation

This patch also replaces svc_rpcb_setup() with svc_bind().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 4db77695 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: pass service to per-net up and down functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 786185b5 03-May-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: move per-net operations from svc_destroy()

The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 9793f7c8 02-May-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: new svc_bind() routine introduced

This new routine is responsible for service registration in a specified
network context.

The idea is to separate service creation from per-net operations.

Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# e3f70ead 29-Mar-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: pass network namespace to creation and destruction routines

v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().

These routines are called from locks reclaimer() kernel thread. This thread
works in "init_net" network context and currently relays on persence on lockd
thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay
on current network context. So let's pass corrent one into them.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 1df00640 21-Mar-2012 J. Bruce Fields <bfields@redhat.com>

Merge nfs containerization work from Trond's tree

The nfs containerization work is a prerequisite for Jeff Layton's reboot
recovery rework.


# de5b8e8e 06-Feb-2012 NeilBrown <neilb@suse.de>

lockd: fix arg parsing for grace_period and timeout.

If you try to set grace_period or timeout via a module parameter
to lockd, and do this on a big-endian machine where

sizeof(int) != sizeof(unsigned long)

it won't work. This number given will be effectively shifted right
by the difference in those two sizes.

So cast kp->arg properly to get correct result.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 3b64739f 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: shutdown NLM hosts in network namespace context

Lockd now managed in network namespace context. And this patch introduces
network namespace related NLM hosts shutdown in case of releasing per-net Lockd
resources.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# bb2224df 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: per-net up and down routines introduced

This patch introduces per-net Lockd initialization and destruction routines.
The logic is the same as in global Lockd up and down routines. Probably the
solution is not the best one. But at least it looks clear.
So per-net "up" routine are called only in case of lockd is running already. If
per-net resources are not allocated yet, then service is being registered with
local portmapper and lockd sockets created.
Per-net "down" routine is called on every lockd_down() call in case of global
users counter is not zero.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# a9c5d73a 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: pernet usage counter introduced

Lockd is going to be shared between network namespaces - i.e. going to be able
to handle lock requests from different network namespaces. This means, that
network namespace related resources have to be allocated not once (like now),
but for every network namespace context, from which service is requested to
operate.
This patch implements Lockd per-net users accounting. New per-net counter is
used to determine, when per-net resources have to be freed.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# c228fa20 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: create permanent lockd sockets in current network namespace

This patch parametrizes Lockd permanent sockets creation routine by network
namespace context.
It also replaces hard-coded init_net with current network namespace context in
Lockd sockets creation routines.
This approach looks safe, because Lockd is created during NFS mount (or NFS
server start) and thus socket is required exactly in current network namespace
context. But in the same time it means, that Lockd sockets inherits first Lockd
requester network namespace. This issue will be fixed in further patches of the
series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 4cb54ca2 20-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: search for service transports in network namespace context

Service transports are parametrized by network namespace. And thus lookup of
transport instance have to take network namespace into account.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>

# 11fd165c 28-Jul-2011 Eric Dumazet <eric.dumazet@gmail.com>

sunrpc: use better NUMA affinities

Use NUMA aware allocations to reduce latencies and increase throughput.

sunrpc kthreads can use kthread_create_on_node() if pool_mode is
"percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
also take into account NUMA node affinity for memory allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: Neil Brown <neilb@suse.de>
CC: David Miller <davem@davemloft.net>
Reviewed-by: Greg Banks <gnb@fastmail.fm>
[bfields@redhat.com: fix up caller nfs41_callback_up]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 763641d8 26-Oct-2010 Arnd Bergmann <arnd@arndb.de>

lockd: push lock_flocks down

lockd should use lock_flocks() instead of lock_kernel()
to lock against posix locks accessing the i_flock list.

This is a prerequisite to turning lock_flocks into a
spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>

# fc5d00b0 29-Sep-2010 Pavel Emelyanov <xemul@parallels.com>

sunrpc: Add net argument to svc_create_xprt

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>

# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

# d6783b2b 26-Jan-2010 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()

Clean up: Bruce observed we have more or less common logic in each of
svc_create_xprt()'s callers: the check to create an IPv6 RPC listener
socket only if CONFIG_IPV6 is set. I'm about to add another case
that does just the same.

If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
call sites can get rid of the "#ifdef" ugliness, and can use the same
logic with or without IPv6 support available in the kernel.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 6d456111 16-Nov-2009 Eric W. Biederman <ebiederm@xmission.com>

sysctl: Drop & in front of every proc_handler.

For consistency drop & in front of every proc_handler. Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

# ab09203e 05-Nov-2009 Eric W. Biederman <ebiederm@xmission.com>

sysctl fs: Remove dead binary sysctl support

Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name
and .strategy members of sysctl tables are dead code. Remove them.

Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

# 89996df4 06-May-2009 J. Bruce Fields <bfields@citi.umich.edu>

lockd: fix list corruption on lockd restart

If lockd is signalled soon enough after restart then locks_start_grace()
will try to re-add an entry to a list and trigger a lock corruption
warning.

Thanks to Wang Chen for the problem report and diagnosis.

WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
...
list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128).
...
Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3
Call Trace:
[<c042d5b5>] warn_slowpath+0x71/0xa0
[<c0422a96>] ? update_curr+0x11d/0x125
[<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150
[<c044b270>] ? trace_hardirqs_on+0xb/0xd
[<c051c61a>] ? _raw_spin_lock+0x53/0xfa
[<c051c89f>] __list_add+0x27/0x5c
[<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd]
[<ef8f34da>] set_grace_period+0x39/0x53 [lockd]
[<c06b8921>] ? lock_kernel+0x1c/0x28
[<ef8f3558>] lockd+0x64/0x164 [lockd]
[<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150
[<c04227b0>] ? complete+0x34/0x3e
[<ef8f34f4>] ? lockd+0x0/0x164 [lockd]
[<ef8f34f4>] ? lockd+0x0/0x164 [lockd]
[<c043dd42>] kthread+0x45/0x6b
[<c043dcfd>] ? kthread+0x0/0x6b
[<c0403c23>] kernel_thread_helper+0x7/0x10

Reported-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: stable@kernel.org

# eb16e907 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

lockd: Start PF_INET6 listener only if IPv6 support is available

Apparently a lot of people need to disable IPv6 completely on their
distributor-built systems, which have CONFIG_IPV6_MODULE enabled at
build time.

They do this by blacklisting the ipv6.ko module. This causes the
creation of the lockd service listener to fail if CONFIG_IPV6_MODULE
is set, but the module cannot be loaded.

Now that the kernel's PF_INET6 RPC listeners are completely separate
from PF_INET listeners, we can always start PF_INET. Then lockd can
try to start PF_INET6, but it isn't required to be available.

Note this has the added benefit that NLM callbacks from AF_INET6
servers will never come from AF_INET remotes. We no longer have to
worry about matching mapped IPv4 addresses to AF_INET when comparing
addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 26298caa 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

NFS: Revert creation of IPv6 listeners for lockd and NFSv4 callbacks

We're about to convert over to using separate PF_INET and PF_INET6
listeners, instead of a single PF_INET6 listener that also receives
AF_INET requests and maps them to AF_INET6.

Clear the way by removing the logic in lockd and the NFSv4 callback
server that creates an AF_INET6 service listener.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 49a9072f 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove @family argument from svc_create() and svc_create_pooled()

Since an RPC service listener's protocol family is specified now via
svc_create_xprt(), it no longer needs to be passed to svc_create() or
svc_create_pooled(). Remove that argument from the synopsis of those
functions, and remove the sv_family field from the svc_serv struct.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 9652ada3 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Change svc_create_xprt() to take a @family argument

The sv_family field is going away. Pass a protocol family argument to
svc_create_xprt() instead of extracting the family from the passed-in
svc_serv struct.

Again, as this is a listener socket and not an address, we make this
new argument an "int" protocol family, instead of an "sa_family_t."

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 0dba7c2a 31-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Clean up flow of control in make_socks() function

Clean up: Use Bruce's preferred control flow style in make_socks().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# d3fe5ea7 31-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Refactor make_socks() function

Clean up: extract common logic in NLM's make_socks() function
into a helper.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# b064ec03 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

lockd: Enable NLM use of AF_INET6

If the kernel is configured to support IPv6 and the RPC server can register
services via rpcbindv4, we are all set to enable IPv6 support for lockd.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aime Le Rouzic <aime.le-rouzic@bull.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# b7ba597f 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NSM: Move nsm_use_hostnames to mon.c

Clean up.

Treat the nsm_use_hostnames global variable like nsm_local_state.
Note that the default value of nsm_use_hostnames is still zero.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# e6765b83 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NSM: Remove include/linux/lockd/sm_inter.h

Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
now. Remove it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# c72a476b 20-Oct-2008 Jeff Layton <jlayton@kernel.org>

lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)

The default method for calculating the number of connections allowed
per RPC service arbitrarily limits single-threaded services to 80
connections. This is too low for services like lockd and artificially
limits the number of TCP clients that it can support.

Have lockd set a default sv_maxconn value to 1024 (which is the typical
default value for RLIMIT_NOFILE. Also add a module parameter to allow an
admin to set this to an arbitrary value.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 2de59872 23-Dec-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

LOCKD: Make lockd_up() and lockd_down() exported GPL-only

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 2c5e7615 20-Nov-2008 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: clean up grace period on early exit

If nfsd was shut down before the grace period ended, we could end up
with a freed object still on grace_list. Thanks to Jeff Moyer for
reporting the resulting list corruption warnings.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Jeff Moyer <jmoyer@redhat.com>

# 26a41409 03-Oct-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Remove "proto" argument from lockd_up()

Clean up: Now that lockd_up() starts listeners for both transports, the
"proto" argument is no longer needed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 8c3916f4 03-Oct-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Always start both UDP and TCP listeners

Commit 24e36663, which first appeared in 2.6.19, changed lockd so that
the client side starts a UDP listener only if there is a UDP NFSv2/v3
mount. Its description notes:

This... means that lockd will *not* listen on UDP if the only
mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know
if this might be a problem with some servers.

Unfortunately it is a problem for Linux itself. The rpc.statd daemon
on Linux uses UDP for contacting the local lockd, no matter which
protocol is used for NFS mounts. Without a local lockd UDP listener,
NFSv2/v3 lock recovery from Linux NFS clients always fails.

Revert parts of commit 24e36663 so lockd_up() always starts both
listeners.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# af558e33 05-Sep-2007 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: common grace period control

Rewrite grace period code to unify management of grace period across
lockd and nfsd. The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it. This creates a slight race condition, since
the enforcement is not coordinated. It's also more complicated than
necessary.

Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.

We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# c8ab5f2a 18-Mar-2008 J. Bruce Fields <bfields@citi.umich.edu>

lockd: don't depend on lockd main loop to end grace

End lockd's grace period using schedule_delayed_work() instead of a
check on every pass through the main loop.

After a later patch, we'll depend on lockd to end its grace period even
if it's not currently handling requests; so it shouldn't depend on being
woken up from the main loop to do so.

Also, Nakano Hiroaki (who independently produced a similar patch)
noticed that the current behavior is buggy in the face of jiffies
wraparound:

"lockd uses time_before() to determine whether the grace period
has expired. This would seem to be enough to avoid timer
wrap-around issues, but, unfortunately, that is not the case.
The time_* family of comparison functions can be safely used to
compare jiffies relatively close in time, but they stop working
after approximately LONG_MAX/2 ticks. nfsd can suffer this
problem because the time_before() comparison in lockd() is not
performed until the first request comes in, which means that if
there is no lockd traffic for more than LONG_MAX/2 ticks we are
screwed.

"The implication of this is that once time_before() starts
misbehaving any attempt from a NFS client to execute fcntl()
will be received with a NLM_LCK_DENIED_GRACE_PERIOD message for
25 days (assuming HZ=1000). In other words, the 50 seconds grace
period could turn into a grace period of 50 days or more.

"Note: This bug was analyzed independently by Oda-san
<oda@valinux.co.jp> and myself."

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Nakano Hiroaki <nakano.hiroaki@oss.ntt.co.jp>
Cc: Itsuro Oda <oda@valinux.co.jp>

# 8fafa900 24-Jan-2008 J. Bruce Fields <bfields@citi.umich.edu>

locks: allow lockd to process blocked locks during grace period

The check here is currently harmless but unnecessary, since, as the
comment notes, there aren't any blocked-lock callbacks to process
during the grace period anyway.

And eventually we want to allow multiple grace periods that come and go
for different filesystems over the course of the lifetime of lockd, at
which point this check is just going to get in the way.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# e851db5b 30-Jun-2008 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Add address family field to svc_serv data structure

Introduce and initialize an address family field in the svc_serv structure.

This field will determine what family to use for the service's listener
sockets and what families are advertised via the local rpcbind daemon.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# abd1ec4e 11-Jun-2008 Jeff Layton <jlayton@kernel.org>

lockd: close potential race with rapid lockd_up/lockd_down cycle

If lockd_down is called very rapidly after lockd_up returns, then
there is a slim chance that lockd() will never be called. kthread()
will return before calling the function, so we'll end up never
actually calling the cleanup functions for the thread.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 563307b2 24-Apr-2008 Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits)
SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request
make nfs_automount_list static
NFS: remove duplicate flags assignment from nfs_validate_mount_data
NFS - fix potential NULL pointer dereference v2
SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use
SUNRPC: Fix a race in gss_refresh_upcall()
SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
SUNRPC: Remove the unused export of xprt_force_disconnect
SUNRPC: remove XS_SENDMSG_RETRY
SUNRPC: Protect creds against early garbage collection
NFSv4: Attempt to use machine credentials in SETCLIENTID calls
NFSv4: Reintroduce machine creds
NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
nfs: fix printout of multiword bitfields
nfs: return negative error value from nfs{,4}_stat_to_errno
NLM/lockd: Ensure client locking calls use correct credentials
NFS: Remove the buggy lock-if-signalled case from do_setlk()
NLM/lockd: Fix a race when cancelling a blocking lock
NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call
NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel
...


# f97c650d 08-Apr-2008 Jeff Layton <jlayton@kernel.org>

NLM: don't let lockd exit on unexpected svc_recv errors (try #2)

When svc_recv returns an unexpected error, lockd will print a warning
and exit. This problematic for several reasons. In particular, it will
cause the reference counts for the thread to be wrong, and can lead to a
potential BUG() call.

Rather than exiting on error from svc_recv, have the thread do a 1s
sleep and then retry the loop. This is unlikely to cause any harm, and
if the error turns out to be something temporary then it may be able to
recover.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# d751a7cd 07-Feb-2008 Jeff Layton <jlayton@kernel.org>

NLM: Convert lockd to use kthreads

Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no longer
need for the waitqueue code at the end of lockd_down. This also means
that only one lockd can be running at a time which simplifies the code
within lockd's main loop.

This also adds a check for kthread_should_stop in the main loop of
nlmsvc_retry_blocked and after that function returns. There's no sense
continuing to retry blocks if lockd is coming down anyway.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 90d5b180 14-Mar-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: LOCKD fails to load if CONFIG_SYSCTL is not set

Bruce Fields says:
"By the way, we've got another config-related nit here:

http://bugzilla.linux-nfs.org/show_bug.cgi?id=156

You can build lockd without CONFIG_SYSCTL set, but then the module will
fail to load."

For now, disable the sysctl registration calls in lockd if CONFIG_SYSCTL
is not enabled. This allows the kernel to build properly if PROC_FS or
SYSCTL is not enabled, but an NFS client is desired.

In the long run, we would like to be able to build the kernel with an
NFS client but without lockd. This makes sense, for example, if you want
an NFSv4-only NFS client, as NFSv4 doesn't use NLM at all.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 5216a8e7 21-Feb-2008 Pavel Emelyanov <xemul@openvz.org>

Wrap buffers used for rpc debug printks into RPC_IFDEBUG

Sorry for the noise, but here's the v3 of this compilation fix :)

There are some places, which declare the char buf[...] on the stack
to push it later into dprintk(). Since the dprintk sometimes (if the
CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
cause gcc to produce appropriate warnings.

Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
compile them out when not needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# a217813f 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

knfsd: Support adding transports by writing portlist file

Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 7fcb98d5 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Add svc API that queries for a transport instance

Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 7a182083 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Make close transport independent

Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>

# d7c9f1ed 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Change services to use new svc_create_xprt service

Modify the various kernel RPC svcs to use the svc_create_xprt service.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

# 9a8db97e 17-Jul-2007 Marc Eshel <eshel@almaden.ibm.com>

knfsd: lockd: nfsd4: use same grace period for lockd and nfsd4

Both lockd and (in the nfsv4 case) nfsd enforce a "grace period" after reboot,
during which clients may reclaim locks from the previous server instance, but
may not acquire new locks.

Currently the lockd and nfsd enforce grace periods of different lengths. This
may cause problems when we reboot a server with both v2/v3 and v4 clients.
For example, if the lockd grace period is shorter (as is likely the case),
then a v3 client might acquire a new lock that conflicts with a lock already
held (but not yet reclaimed) by a v4 client.

This patch calculates a lease time that lockd and nfsd can both use.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# 83144186 17-Jul-2007 Rafael J. Wysocki <rjw@rjwysocki.net>

Freezer: make kernel threads nonfreezable by default

Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# f61534df 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 405ae7d3 17-Feb-2007 Robert P. J. Day <rpjday@mindspring.com>

Replace remaining references to "driverfs" with "sysfs".

Globally, s/driverfs/sysfs/g.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

# 0b4d4147 14-Feb-2007 Eric W. Biederman <ebiederm@xmission.com>

[PATCH] sysctl: remove insert_at_head from register_sysctl

The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# ad06e4bd 12-Feb-2007 Chuck Lever <chuck.lever@oracle.com>

[PATCH] knfsd: SUNRPC: Add a function to format the address in an svc_rqst for printing

There are loads of places where the RPC server assumes that the rq_addr fields
contains an IPv4 address. Top among these are error and debugging messages
that display the server's IP address.

Let's refactor the address printing into a separate function that's smart
enough to figure out the difference between IPv4 and IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# 482fb94e 12-Feb-2007 Chuck Lever <chuck.lever@oracle.com>

[PATCH] knfsd: SUNRPC: allow creating an RPC service without registering with portmapper

Sometimes we need to create an RPC service but not register it with the local
portmapper. NFSv4 delegation callback, for example.

Change the svc_makesock() API to allow optionally creating temporary or
permanent sockets, optionally registering with the local portmapper, and make
it return the ephemeral port of the new socket.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

# 7cc13edc 06-Nov-2006 Eric W. Biederman <ebiederm@xmission.com>

[PATCH] sysctl: implement CTL_UNNUMBERED

This patch takes the CTL_UNNUMBERD concept from NFS and makes it available to
all new sysctl users.

At the same time the sysctl binary interface maintenance documentation is
updated to mention and to describe what is needed to successfully maintain the
sysctl binary interface.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 460f5cac 04-Oct-2006 Olaf Kirch <okir@suse.de>

[PATCH] knfsd: export nsm_local_state to user space via sysctl

Every NLM call includes the client's NSM state. Currently, the Linux client
always reports 0 - which seems not to cause any problems, but is not what the
protocol says.

This patch exposes the kernel's internal variable to user space via a sysctl,
which can be set at system boot time by statd.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# abd1f500 04-Oct-2006 Olaf Kirch <okir@suse.de>

[PATCH] knfsd: lockd: optionally use hostnames for identifying peers

This patch adds the nsm_use_hostnames sysctl and module param. If set, lockd
will use the client's name (as given in the NLM arguments) to find the NSM
handle. This makes recovery work when the NFS peer is multi-homed, and the
reboot notification arrives from a different IP than the original lock calls.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 4a3ae42d 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Correctly handle error condition from lockd_up

If lockd_up fails - what should we expect? Do we have to later call
lockd_down?

Well the nfs client thinks "no", the nfs server thinks "yes". lockd thinks
"yes".

The only answer that really makes sense is "no" !!

So:
Make lockd_up only increment nlmsvc_users on success.
Make nfsd handle errors from lockd_up properly.
Make sure lockd_up(0) never fails when lockd is running
so that the 'reclaimer' call to lockd_up doesn't need to
be error checked.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 7dcf91ec 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Move makesock failed warning into make_socks.

Thus it is printed for any path that leads to failure (make_socks is called
from two places).

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 6fb2b47f 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Drop 'serv' option to svc_recv and svc_process

It isn't needed as it is available in rqstp->rq_server, and dropping it allows
some local vars to be dropped.

[akpm@osdl.org: build fix]
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 24e36663 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: be more selective in which sockets lockd listens on

Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set.

However as lockd performs services of the client as well, this is a problem.
If CONFIG_NfSD_TCP is not set, and a tcp mount is used, the server will not be
able to call back to lockd.

So:
- add an option to lockd_up saying which protocol is needed
- Always open sockets for which an explicit port was given, otherwise
only open a socket of the type required
- Change nfsd to do one lockd_up per socket rather than one per thread.

This
- removes the dependancy on CONFIG_NFSD_TCP
- means that lockd may open sockets other than at startup
- means that lockd will *not* listen on UDP if the only
mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know if this
might be a problem with some servers.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# bc591ccf 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: add a callback for when last rpc thread finishes

nfsd has some cleanup that it wants to do when the last thread exits, and
there will shortly be some more. So collect this all into one place and
define a callback for an rpc service to call when the service is about to be
destroyed.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 6ab3d562 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de>

Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

# 353ab6e9 26-Mar-2006 Ingo Molnar <mingo@elte.hu>

[PATCH] sem2mutex: fs/

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Cc: Robert Love <rml@tech9.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# e8c96f8c 24-Mar-2006 Tobias Klauser <tklauser@nuerscht.ch>

[PATCH] fs: Use ARRAY_SIZE macro

Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE. Some trailing whitespaces are also deleted.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 2bd61579 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.

...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
interrupt RPC calls too (as per the Solaris implementation).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

# 7ee91ec1 13-Jul-2005 Steve Dickson <SteveD@redhat.com>

[PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64

Allow the setting of NLM timeouts and grace periods through the proc and
sysclt interfaces on x86_64 architectures

Signed-off-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 46be925f 23-Jun-2005 NeilBrown <neilb@cse.unsw.edu.au>

[PATCH] knfsd: lockd: flush signals on shutdown

Silence another annoying "failed to contact portmap (errno -512)" on shutdown.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

# 6b044fba 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: use svc_set_num_threads() for thread start and stop

svc_set_num_threads() does everything that lockd_start_svc() does, except
set sv_maxconn. It also (when passed 0) finds the threads and
stops them with kthread_stop().

So move the setting for sv_maxconn, and use svc_set_num_thread()

We now don't need nlmsvc_task.

Now that we use svc_set_num_threads() it makes sense to set svo_module.
This request that the thread exists with module_put_and_exit().
Also fix the documentation for svo_module to make this explicit.

svc_prepare_thread is now only used where it is defined, so it can be
made static.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# ecd3ad68 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: rename lockd_create_svc() to lockd_get()

lockd_create_svc() already does an svc_get() if the service already
exists, so it is more like a "get" than a "create".

So:
- Move the increment of nlmsvc_users into the function as well
- rename to lockd_get().

It is now the inverse of lockd_put().

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 865b6740 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: introduce lockd_put()

There is some cleanup that is duplicated in lockd_down() and the failure
path of lockd_up().
Factor these out into a new lockd_put() and call it from both places.

lockd_put() does *not* take the mutex - that must be held by the caller.
It decrements nlmsvc_users and if that reaches zero, it cleans up.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 6a4e2527 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: move svc_exit_thread() into the thread

The normal place to call svc_exit_thread() is from the thread itself
just before it exists.
Do this for lockd.

This means that nlmsvc_rqst is not used out side of lockd_start_svc(),
so it can be made local to that function, and renamed to 'rqst'.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# b73a2972 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: move lockd_start_svc() call into lockd_create_svc()

lockd_start_svc() only needs to be called once, just after the svc is
created. If the start fails, the svc is discarded too.

It thus makes sense to call lockd_start_svc() from lockd_create_svc().
This allows us to remove the test against nlmsvc_rqst at the start of
lockd_start_svc() - it must always be NULL.

lockd_up() only held an extra reference on the svc until a thread was
created - then it dropped it. The thread - and thus the extra reference
- will remain until kthread_stop() is called.
Now that the thread is created in lockd_create_svc(), the extra
reference can be dropped there. So the 'serv' variable is no longer
needed in lockd_up().

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 5a8a7ff5 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: simplify management of network status notifiers

Now that the network status notifiers use nlmsvc_serv rather then
nlmsvc_rqst the management can be simplified.

Notifier unregistration synchronises with any pending notifications so
providing we unregister before nlm_serv is freed no further interlock
is required.

So we move the unregister call to just before the thread is killed
(which destroys the service) and just before the service is destroyed in
the failure-path of lockd_up().

Then nlm_ntf_refcnt and nlm_ntf_wq can be removed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 2840fe86 28-Nov-2021 NeilBrown <neilb@suse.de>

lockd: introduce nlmsvc_serv

lockd has two globals - nlmsvc_task and nlmsvc_rqst - but mostly it
wants the 'struct svc_serv', and when it doesn't want it exactly it can
get to what it wants from the serv.

This patch is a first step to removing nlmsvc_task and nlmsvc_rqst. It
introduces nlmsvc_serv to store the 'struct svc_serv*'. This is set as
soon as the serv is created, and cleared only when it is destroyed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# ec52361d 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC: stop using ->sv_nrthreads as a refcount

The use of sv_nrthreads as a general refcount results in clumsy code, as
is seen by various comments needed to explain the situation.

This patch introduces a 'struct kref' and uses that for reference
counting, leaving sv_nrthreads to be a pure count of threads. The kref
is managed particularly in svc_get() and svc_put(), and also nfsd_put();

svc_destroy() now takes a pointer to the embedded kref, rather than to
the serv.

nfsd allows the svc_serv to exist with ->sv_nrhtreads being zero. This
happens when a transport is created before the first thread is started.
To support this, a 'keep_active' flag is introduced which holds a ref on
the svc_serv. This is set when any listening socket is successfully
added (unless there are running threads), and cleared when the number of
threads is set. So when the last thread exits, the nfs_serv will be
destroyed.
The use of 'keep_active' replaces previous code which checked if there
were any permanent sockets.

We no longer clear ->rq_server when nfsd() exits. This was done
to prevent svc_exit_thread() from calling svc_destroy().
Instead we take an extra reference to the svc_serv to prevent
svc_destroy() from being called.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# 8c62d127 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC/NFSD: clean up get/put functions.

svc_destroy() is poorly named - it doesn't necessarily destroy the svc,
it might just reduce the ref count.
nfsd_destroy() is poorly named for the same reason.

This patch:
- removes the refcount functionality from svc_destroy(), moving it to
a new svc_put(). Almost all previous callers of svc_destroy() now
call svc_put().
- renames nfsd_destroy() to nfsd_put() and improves the code, using
the new svc_destroy() rather than svc_put()
- removes a few comments that explain the important for balanced
get/put calls. This should be obvious.

The only non-trivial part of this is that svc_destroy() would call
svc_sock_update() on a non-final decrement. It can no longer do that,
and svc_put() isn't really a good place of it. This call is now made
from svc_exit_thread() which seems like a good place. This makes the
call *before* sv_nrthreads is decremented rather than after. This
is not particularly important as the call just sets a flag which
causes sv_nrthreads set be checked later. A subsequent patch will
improve the ordering.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# df5e49c8 28-Nov-2021 NeilBrown <neilb@suse.de>

SUNRPC: change svc_get() to return the svc.

It is common for 'get' functions to return the object that was 'got',
and there are a couple of places where users of svc_get() would be a
little simpler if svc_get() did that.

Make it so.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# fda49441 13-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Replace the "__be32 *p" parameter to .pc_encode

The passed-in value of the "__be32 *p" parameter is now unused in
every server-side XDR encoder, and can be removed.

Note also that there is a line in each encoder that sets up a local
pointer to a struct xdr_stream. Passing that pointer from the
dispatcher instead saves one line per encoder function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 16c66364 12-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Replace the "__be32 *p" parameter to .pc_decode

The passed-in value of the "__be32 *p" parameter is now unused in
every server-side XDR decoder, and can be removed.

Note also that there is a line in each decoder that sets up a local
pointer to a struct xdr_stream. Passing that pointer from the
dispatcher instead saves one line per decoder function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 5c2465df 15-Jul-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Set rq_auth_stat in the pg_authenticate() callout

In a few moments, rq_auth_stat will need to be explicitly set to
rpc_auth_ok before execution gets to the dispatcher.

svc_authenticate() already sets it, but it often gets reset to
rpc_autherr_badcred right after that call, even when authentication
is successful. Let's ensure that the pg_authenticate callout and
svc_set_client() set it properly in every case.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# d02a3a2c 02-Aug-2021 Jia He <hejianet@gmail.com>

lockd: change the proc_handler for nsm_use_hostnames

nsm_use_hostnames is a module parameter and it will be exported to sysctl
procfs. This is to let user sometimes change it from userspace. But the
minimal unit for sysctl procfs read/write it sizeof(int).
In big endian system, the converting from/to bool to/from int will cause
error for proc items.

This patch use a new proc_handler proc_dobool to fix it.

Signed-off-by: Jia He <hejianet@gmail.com>
Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
[thuth: Fix typo in commit message]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>


# a9ad1a80 03-Jun-2021 Chuck Lever <chuck.lever@oracle.com>

lockd: Create a simplified .vs_dispatch method for NLM requests

To enable xdr_stream-based encoding and decoding, create a bespoke
RPC dispatch function for the lockd service.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 09c434b8 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Add SPDX license identifier for more missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have MODULE_LICENCE("GPL*") inside which was used in the initial
scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 40373b12 08-Apr-2019 Trond Myklebust <trondmy@gmail.com>

lockd: Pass the user cred from knfsd when starting the lockd server

When starting up a new knfsd server, pass the user cred to the
supporting lockd server.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 4df493a2 08-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Cache the process user cred in the RPC server listener

In order to be able to interpret uids and gids correctly in knfsd, we
should cache the user namespace of the process that created the RPC
server's listener. To do so, we refcount the credential of that process.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 642ee6b2 09-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Allow further customisation of RPC program registration

Add a callback to allow customisation of the rpcbind registration.
When clients have the ability to turn on and off version support,
we want to allow them to also prevent registration of those
versions with the rpc portmapper.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 8e5b6773 09-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Add a callback to initialise server requests

Add a callback to help initialise server requests before they are
processed. This will allow us to clean up the NFS server version
support, and to make it container safe.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 2f635cee 27-Mar-2018 Kirill Tkhai <ktkhai@virtuozzo.com>

net: Drop pernet_operations::async

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 554faf28 07-Feb-2018 Colin Ian King <colin.king@canonical.com>

lockd: make nlm_ntf_refcnt and nlm_ntf_wq static

The variables nlm_ntf_refcnt and nlm_ntf_wq are local to the source and
do not need to be in global scope, so make them static.

Cleans up sparse warnings:
fs/lockd/svc.c:60:10: warning: symbol 'nlm_ntf_refcnt' was not declared.
Should it be static?
fs/lockd/svc.c:61:1: warning: symbol 'nlm_ntf_wq' was not declared.
Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 02df428c 26-Feb-2018 Kirill Tkhai <ktkhai@virtuozzo.com>

net: Convert simple pernet_operations

These pernet_operations make pretty simple actions
like variable initialization on init, debug checks
on exit, and so on, and they obviously are able
to be executed in parallel with any others:

vrf_net_ops
lockd_net_ops
grace_net_ops
xfrm6_tunnel_net_ops
kcm_net_ops
tcf_net_ops

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6b18dd1c 10-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

race of lockd inetaddr notifiers vs nlmsvc_rqst change

lockd_inet[6]addr_event use nlmsvc_rqst without taken nlmsvc_mutex,
nlmsvc_rqst can be changed during execution of notifiers and crash the host.

Patch enables access to nlmsvc_rqst only when it was correctly initialized
and delays its cleanup until notifiers are no longer in use.

Note that nlmsvc_rqst can be temporally set to ERR_PTR, so the "if
(nlmsvc_rqst)" check in notifiers is insufficient on its own.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3a2b19d1 02-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: lost rollback of set_grace_period() in lockd_down_net()

Commit efda760fe95ea ("lockd: fix lockd shutdown race") is incorrect,
it removes lockd_manager and disarm grace_period_end for init_net only.

If nfsd was started from another net namespace lockd_up_net() calls
set_grace_period() that adds lockd_manager into per-netns list
and queues grace_period_end delayed work.

These action should be reverted in lockd_down_net().
Otherwise it can lead to double list_add on after restart nfsd in netns,
and to use-after-free if non-disarmed delayed work will be executed after netns destroy.

Fixes: efda760fe95e ("lockd: fix lockd shutdown race")
Cc: stable@vger.kernel.org
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# a3152f14 06-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: added cleanup checks in exit_net hook

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# e919b076 07-Nov-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: remove net pointer from messages

Publishing of net pointer is not safe,
use net->ns.inum as net ID in debug messages

[ 171.757678] lockd_up_net: per-net data created; net=f00001e7
[ 171.767188] NFSD: starting 90-second grace period (net f00001e7)
[ 300.653313] lockd: nuking all hosts in net f00001e7...
[ 300.653641] lockd: host garbage collection for net f00001e7
[ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7
[ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7
[ 300.711847] lockd: nuking all hosts in net 0...
[ 300.711847] lockd: host garbage collection for net 0
[ 300.711848] lockd: nlmsvc_mark_resources for net 0

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# dc3033e1 20-Oct-2017 Vasily Averin <vvs@virtuozzo.com>

lockd: double unregister of inetaddr notifiers

lockd_up() can call lockd_unregister_notifiers twice:
inside lockd_start_svc() when it calls lockd_svc_exit_thread()
and then in error path of lockd_up()

Patch forces lockd_start_svc() to unregister notifiers in all error cases
and removes extra unregister in error path of lockd_up().

Fixes: cb7d224f82e4 "lockd: unregister notifier blocks if the service ..."
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# e4dca7b7 17-Oct-2017 Kees Cook <keescook@chromium.org>

treewide: Fix function prototypes for module_param_call()

Several function prototypes for the set/get functions defined by
module_param_call() have a slightly wrong argument types. This fixes
those in an effort to clean up the calls when running under type-enforced
compiler instrumentation for CFI. This is the result of running the
following semantic patch:

@match_module_param_call_function@
declarer name module_param_call;
identifier _name, _set_func, _get_func;
expression _arg, _mode;
@@

module_param_call(_name, _set_func, _get_func, _arg, _mode);

@fix_set_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._set_func;
identifier _val, _param;
type _val_type, _param_type;
@@

int _set_func(
-_val_type _val
+const char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }

@fix_get_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._get_func;
identifier _val, _param;
type _val_type, _param_type;
@@

int _get_func(
-_val_type _val
+char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }

Two additional by-hand changes are included for places where the above
Coccinelle script didn't notice them:

drivers/platform/x86/thinkpad_acpi.c
fs/lockd/svc.c

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>


# afea5657 31-Jul-2017 Chuck Lever <chuck.lever@oracle.com>

sunrpc: Const-ify struct sv_serv_ops

Close an attack vector by moving the arrays of per-server methods to
read-only memory.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# aa8217d5 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_version instances as const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 0becc118 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: move pc_count out of struct svc_procinfo

pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# e9679189 12-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: mark all struct svc_version instances as const

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 7fd38af9 08-May-2017 Christoph Hellwig <hch@lst.de>

sunrpc: move pc_count out of struct svc_procinfo

pc_count is the only writeable memeber of struct svc_procinfo, which is
a good candidate to be const-ified as it contains function pointers.

This patch moves it into out out struct svc_procinfo, and into a
separate writable array that is pointed to by struct svc_version.

Signed-off-by: Christoph Hellwig <hch@lst.de>


# efda760f 28-Mar-2017 J. Bruce Fields <bfields@redhat.com>

lockd: fix lockd shutdown race

As reported by David Jeffery: "a signal was sent to lockd while lockd
was shutting down from a request to stop nfs. The signal causes lockd
to call restart_grace() which puts the lockd_net structure on the grace
list. If this signal is received at the wrong time, it will occur after
lockd_down_net() has called locks_end_grace() but before
lockd_down_net() stops the lockd thread. This leads to lockd putting
the lockd_net structure back on the grace list, then exiting without
anything removing it from the list."

So, perform the final locks_end_grace() from the the lockd thread; this
ensures it's serialized with respect to restart_grace().

Reported-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3f07c014 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# c01410f7 05-Jan-2017 Scott Mayhew <smayhew@redhat.com>

lockd: initialize sin6_scope_id in lockd_inet6addr_event()

I noticed this was missing when I was testing with link local addresses.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# c7d03a00 16-Nov-2016 Alexey Dobriyan <adobriyan@gmail.com>

netns: make struct pernet_operations::id unsigned int

Make struct pernet_operations::id unsigned.

There are 2 reasons to do so:

1)
This field is really an index into an zero based array and
thus is unsigned entity. Using negative value is out-of-bound
access by definition.

2)
On x86_64 unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added or subtracted to pointers
are preffered to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

void f(long *p, int i)
{
g(p[i]);
}

roughly translates to

movsx rsi, esi
mov rdi, [rsi+...]
call g

MOVSX is 3 byte instruction which isn't necessary if the variable is
unsigned because x86_64 is zero extending by default.

Now, there is net_generic() function which, you guessed it right, uses
"int" as an array index:

static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a semmingly random artefact of code generation with register
allocator being used differently. gcc decides that some variable
needs to live in new r8+ registers and every access now requires REX
prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be
used which is longer than [r8]

However, overall balance is in negative direction:

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# cb7d224f 30-Jun-2016 Scott Mayhew <smayhew@redhat.com>

lockd: unregister notifier blocks if the service fails to come up completely

If the lockd service fails to start up then we need to be sure that the
notifier blocks are not registered, otherwise a subsequent start of the
service could cause the same notifier to be registered twice, leading to
soft lockups.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 0751ddf77b6a "lockd: Register callbacks on the inetaddr_chain..."
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 2a297450 23-Dec-2015 Julia Lawall <Julia.Lawall@lip6.fr>

lockd: constify nlmsvc_binding structure

The nlmsvc_binding structure is never modified, so declare it as const.

Done with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# ea44463f 01-Jan-2016 Geliang Tang <geliangtang@163.com>

lockd: use to_delayed_work

Use to_delayed_work() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 0751ddf7 11-Dec-2015 Scott Mayhew <smayhew@redhat.com>

lockd: Register callbacks on the inetaddr_chain and inet6addr_chain

Register callbacks on inetaddr_chain and inet6addr_chain to trigger
cleanup of lockd transport sockets when an ip address is deleted.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 0d0f4aab 07-Oct-2015 Andrey Ryabinin <ryabinin.a.a@gmail.com>

lockd: get rid of reference-counted NSM RPC clients

Currently we have reference-counted per-net NSM RPC client
which created on the first monitor request and destroyed
after the last unmonitor request. It's needed because
RPC client need to know 'utsname()->nodename', but utsname()
might be NULL when nsm_unmonitor() called.

So instead of holding the rpc client we could just save nodename
in struct nlm_host and pass it to the rpc_create().
Thus ther is no need in keeping rpc client until last
unmonitor request. We could create separate RPC clients
for each monitor/unmonitor requests.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 0ad95472 23-Sep-2015 Andrey Ryabinin <ryabinin.a.a@gmail.com>

lockd: create NSM handles per net namespace

Commit cb7323fffa85 ("lockd: create and use per-net NSM
RPC clients on MON/UNMON requests") introduced per-net
NSM RPC clients. Unfortunately this doesn't make any sense
without per-net nsm_handle.

E.g. the following scenario could happen
Two hosts (X and Y) in different namespaces (A and B) share
the same nsm struct.

1. nsm_monitor(host_X) called => NSM rpc client created,
nsm->sm_monitored bit set.
2. nsm_mointor(host-Y) called => nsm->sm_monitored already set,
we just exit. Thus in namespace B ln->nsm_clnt == NULL.
3. host X destroyed => nsm->sm_count decremented to 1
4. host Y destroyed => nsm_unmonitor() => nsm_mon_unmon() => NULL-ptr
dereference of *ln->nsm_clnt

So this could be fixed by making per-net nsm_handles list,
instead of global. Thus different net namespaces will not be able
share the same nsm_handle.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# c87fb4a3 05-Aug-2015 J. Bruce Fields <bfields@redhat.com>

lockd: NLM grace period shouldn't block NFSv4 opens

NLM locks don't conflict with NFSv4 share reservations, so we're not
going to learn anything new by watiting for them.

They do conflict with NFSv4 locks and with delegations.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# b9e13cdf 08-Jun-2015 Jeff Layton <jlayton@kernel.org>

nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operation

For now, all services use svc_xprt_do_enqueue, but once we add
workqueue-based service support, we'll need to do something different.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# ea126e74 08-Jun-2015 Jeff Layton <jlayton@kernel.org>

nfsd/sunrpc: add a new svc_serv_ops struct and move sv_shutdown into it

In later patches we'll need to abstract out more operations on a
per-service level, besides sv_shutdown and sv_function.

Declare a new svc_serv_ops struct to hold these operations, and move
sv_shutdown into this struct.

Signed-off-by: Shirley Ma <shirley.ma@oracle.com>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Tested-by: Shirley Ma <shirley.ma@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 06bed7d1 02-Jan-2015 Trond Myklebust <trond.myklebust@primarydata.com>

LOCKD: Fix a race when initialising nlmsvc_timeout

This commit fixes a race whereby nlmclnt_init() first starts the lockd
daemon, and then calls nlm_bind_host() with the expectation that
nlmsvc_timeout has already been initialised. Unfortunately, there is no
no synchronisation between lockd() and lockd_up() to guarantee that this
is the case.

Fix is to move the initialisation of nlmsvc_timeout into lockd_create_svc

Fixes: 9a1b6bf818e74 ("LOCKD: Don't call utsname()->nodename...")
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org # 3.10.x
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 0b5707e4 19-Nov-2014 Jeff Layton <jlayton@kernel.org>

sunrpc: require svc_create callers to pass in meaningful shutdown routine

Currently all svc_create callers pass in NULL for the shutdown parm,
which then gets fixed up to be svc_rpcb_cleanup if the service uses
rpcbind.

Simplify this by instead having the the only caller that requires it
(lockd) pass in svc_rpcb_cleanup and get rid of the special casing.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# d68e3c4a 12-Sep-2014 Jeff Layton <jlayton@kernel.org>

lockd: add a /proc/fs/lockd/nlm_end_grace file

Add a new procfile that will allow a (privileged) userland process to
end the NLM grace period early. The basic idea here will be to have
sm-notify write to this file, if it sent out no NOTIFY requests when
it runs. In that situation, we can generally expect that there will be
no reclaim requests so the grace period can be lifted early.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>


# f7790029 12-Sep-2014 Jeff Layton <jlayton@kernel.org>

lockd: move lockd's grace period handling into its own module

Currently, all of the grace period handling is part of lockd. Eventually
though we'd like to be able to build v4-only servers, at which point
we'll need to put all of this elsewhere.

Move the code itself into fs/nfs_common and have it build a grace.ko
module. Then, rejigger the Kconfig options so that both nfsd and lockd
enable it automatically.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>


# 7c17705e 29-Aug-2014 J. Bruce Fields <bfields@redhat.com>

lockd: fix rpcbind crash on lockd startup failure

Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
then nfs mounting without portmap or rpcbind running using a busybox
mount resulted in:

# mount -t nfs 10.30.130.21:/opt /mnt
svc: failed to register lockdv1 RPC service (errno 111).
lockd_up: makesock failed, error=-111
Unable to handle kernel paging request for data at address 0x00000030
Faulting instruction address: 0xc055e65c
Oops: Kernel access of bad area, sig: 11 [#1]
MPC85xx CDS
Modules linked in:
CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
task: cf29cea0 ti: cf35c000 task.ti: cf35c000
NIP: c055e65c LR: c0566490 CTR: c055e648
REGS: cf35dad0 TRAP: 0300 Not tainted (3.10.44.cge)
MSR: 00029000 <CE,EE,ME> CR: 22442488 XER: 20000000
DEAR: 00000030, ESR: 00000000

GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
00000000
GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
10090ae8
GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
bfa46ef0
GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
cf0ded80
NIP [c055e65c] call_start+0x14/0x34
LR [c0566490] __rpc_execute+0x70/0x250
Call Trace:
[cf35db80] [00000080] 0x80 (unreliable)
[cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
[cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
[cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
[cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
[cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
[cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
[cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
[cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
[cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
[cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
[cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
[cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
[cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
[cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
[cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
[cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
[cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
[cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

The addition of svc_shutdown_net() resulted in two calls to
svc_rpcb_cleanup(); the second is no longer necessary and crashes when
it calls rpcb_register_call with clnt=NULL.

Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Fixes: 679b033df484 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
Cc: stable@vger.kernel.org
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# d4e89902 02-Sep-2014 Trond Myklebust <trond.myklebust@primarydata.com>

lockd: Do not start the lockd thread before we've set nlmsvc_rqst->rq_task

This fixes an Oopsable race when starting lockd.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# d6a7ce42 03-Aug-2014 Trond Myklebust <trond.myklebust@primarydata.com>

lockd: Ensure that lockd_start_svc sets the server rq_task...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 7ac9fe57 06-Jun-2014 Joe Perches <joe@perches.com>

lockd: convert use of typedef ctl_table to struct ctl_table

This typedef is unnecessary and should just be removed.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 12dd7ecf 01-May-2014 Kees Cook <keescook@chromium.org>

lockd: avoid warning when CONFIG_SYSCTL undefined

When building without CONFIG_SYSCTL, the compiler saw an unused
label. This moves the label into the #ifdef it is used under.

fs/lockd/svc.c: In function ‘init_nlm’:
fs/lockd/svc.c:626:1: warning: label ‘err_sysctl’ defined but not used [-Wunused-label]

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 679b033d 25-Mar-2014 Jeff Layton <jlayton@kernel.org>

lockd: ensure we tear down any live sockets when socket creation fails during lockd_up

We had a Fedora ABRT report with a stack trace like this:

kernel BUG at net/sunrpc/svc.c:550!
invalid opcode: 0000 [#1] SMP
[...]
CPU: 2 PID: 913 Comm: rpc.nfsd Not tainted 3.13.6-200.fc20.x86_64 #1
Hardware name: Hewlett-Packard HP ProBook 4740s/1846, BIOS 68IRR Ver. F.40 01/29/2013
task: ffff880146b00000 ti: ffff88003f9b8000 task.ti: ffff88003f9b8000
RIP: 0010:[<ffffffffa0305fa8>] [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP: 0018:ffff88003f9b9de0 EFLAGS: 00010206
RAX: ffff88003f829628 RBX: ffff88003f829600 RCX: 00000000000041ee
RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
RBP: ffff88003f9b9de8 R08: 0000000000017360 R09: ffff88014fa97360
R10: ffffffff8114ce57 R11: ffffea00051c9c00 R12: ffff88003f829600
R13: 00000000ffffff9e R14: ffffffff81cc7cc0 R15: 0000000000000000
FS: 00007f4fde284840(0000) GS:ffff88014fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4fdf5192f8 CR3: 00000000a569a000 CR4: 00000000001407e0
Stack:
ffff88003f792300 ffff88003f9b9e18 ffffffffa02de02a 0000000000000000
ffffffff81cc7cc0 ffff88003f9cb000 0000000000000008 ffff88003f9b9e60
ffffffffa033bb35 ffffffff8131c86c ffff88003f9cb000 ffff8800a5715008
Call Trace:
[<ffffffffa02de02a>] lockd_up+0xaa/0x330 [lockd]
[<ffffffffa033bb35>] nfsd_svc+0x1b5/0x2f0 [nfsd]
[<ffffffff8131c86c>] ? simple_strtoull+0x2c/0x50
[<ffffffffa033c630>] ? write_pool_threads+0x280/0x280 [nfsd]
[<ffffffffa033c6bb>] write_threads+0x8b/0xf0 [nfsd]
[<ffffffff8114efa4>] ? __get_free_pages+0x14/0x50
[<ffffffff8114eff6>] ? get_zeroed_page+0x16/0x20
[<ffffffff811dec51>] ? simple_transaction_get+0xb1/0xd0
[<ffffffffa033c098>] nfsctl_transaction_write+0x48/0x80 [nfsd]
[<ffffffff811b8b34>] vfs_write+0xb4/0x1f0
[<ffffffff811c3f99>] ? putname+0x29/0x40
[<ffffffff811b9569>] SyS_write+0x49/0xa0
[<ffffffff810fc2a6>] ? __audit_syscall_exit+0x1f6/0x2a0
[<ffffffff816962e9>] system_call_fastpath+0x16/0x1b
Code: 31 c0 e8 82 db 37 e1 e9 2a ff ff ff 48 8b 07 8b 57 14 48 c7 c7 d5 c6 31 a0 48 8b 70 20 31 c0 e8 65 db 37 e1 e9 f4 fe ff ff 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55
RIP [<ffffffffa0305fa8>] svc_destroy+0x128/0x130 [sunrpc]
RSP <ffff88003f9b9de0>

Evidently, we created some lockd sockets and then failed to create
others. make_socks then returned an error and we tried to tear down the
svc, but svc->sv_permsocks was not empty so we ended up tripping over
the BUG() in svc_destroy().

Fix this by ensuring that we tear down any live sockets we created when
socket creation is going to return an error.

Fixes: 786185b5f8abefa (SUNRPC: move per-net operations from...)
Reported-by: Raphos <raphoszap@laposte.net>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# f170168b 03-Jul-2013 Kees Cook <keescook@chromium.org>

drivers: avoid parsing names as kthread_run() format strings

Calling kthread_run with a single name parameter causes it to be handled
as a format string. Many callers are passing potentially dynamic string
content, so use "%s" in those cases to avoid any potential accidents.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e9406db2 18-Sep-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

lockd: per-net NSM client creation and destruction helpers introduced

NSM RPC client can be required on NFSv3 umount, when child reaper is dying (and
destroying it's mount namespace). It means, that current nsproxy is set to
NULL already, but creation of RPC client requires UTS namespace for gaining
hostname string.
This patch introduces reference counted NFS RPC clients creation and
destruction helpers (similar to RPCBIND RPC clients).

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 5b444cc9 17-Aug-2012 J. Bruce Fields <bfields@redhat.com>

svcrpc: remove handling of unknown errors from svc_recv

svc_recv() returns only -EINTR or -EAGAIN. If we really want to worry
about the case where it has a bug that causes it to return something
else, we could stick a WARN() in svc_recv. But it's silly to require
every caller to have all this boilerplate to handle that case.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 5630f7fa 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: move grace period management from lockd() to per-net functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 5ccb0066 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: pass actual network namespace to grace period management functions

Passed network namespace replaced hard-coded init_net

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# db9c4553 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: manage grace list per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 08d44a35 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: make lockd manager allocated per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 66547b02 25-Jul-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: manage grace period per network namespace

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 8dbf28e4 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: add debug message to start and stop functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3d1221df 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: service start function introduced

This is just a code move, which from my POV makes the code look better.
I.e. now on start we have 3 different stages:
1) Service creation.
2) Service per-net data allocation.
3) Service start.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 7d13ec76 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: move global usage counter manipulation from error path

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 24452239 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: service creation function introduced

This function creates service if it doesn't exist, or increases usage
counter if it does, and returns a pointer to it. The usage counter will
be droppepd by svc_destroy() later in lockd_up().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# dbf9b5d7 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: use existing per-net data function on service creation

This patch also replaces svc_rpcb_setup() with svc_bind().

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 4db77695 25-Apr-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

LockD: pass service to per-net up and down functions

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 786185b5 03-May-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: move per-net operations from svc_destroy()

The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 9793f7c8 02-May-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: new svc_bind() routine introduced

This new routine is responsible for service registration in a specified
network context.

The idea is to separate service creation from per-net operations.

Note also: since registering service with svc_bind() can fail, the
service will be destroyed and during destruction it will try to
unregister itself from rpcbind. In this case unregistration has to be
skipped.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# e3f70ead 29-Mar-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: pass network namespace to creation and destruction routines

v2: dereference of most probably already released nlm_host removed in
nlmclnt_done() and reclaimer().

These routines are called from locks reclaimer() kernel thread. This thread
works in "init_net" network context and currently relays on persence on lockd
thread and it's per-net resources. Thus lockd_up() and lockd_down() can't relay
on current network context. So let's pass corrent one into them.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# de5b8e8e 06-Feb-2012 NeilBrown <neilb@suse.de>

lockd: fix arg parsing for grace_period and timeout.

If you try to set grace_period or timeout via a module parameter
to lockd, and do this on a big-endian machine where

sizeof(int) != sizeof(unsigned long)

it won't work. This number given will be effectively shifted right
by the difference in those two sizes.

So cast kp->arg properly to get correct result.

Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 3b64739f 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: shutdown NLM hosts in network namespace context

Lockd now managed in network namespace context. And this patch introduces
network namespace related NLM hosts shutdown in case of releasing per-net Lockd
resources.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# bb2224df 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: per-net up and down routines introduced

This patch introduces per-net Lockd initialization and destruction routines.
The logic is the same as in global Lockd up and down routines. Probably the
solution is not the best one. But at least it looks clear.
So per-net "up" routine are called only in case of lockd is running already. If
per-net resources are not allocated yet, then service is being registered with
local portmapper and lockd sockets created.
Per-net "down" routine is called on every lockd_down() call in case of global
users counter is not zero.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a9c5d73a 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: pernet usage counter introduced

Lockd is going to be shared between network namespaces - i.e. going to be able
to handle lock requests from different network namespaces. This means, that
network namespace related resources have to be allocated not once (like now),
but for every network namespace context, from which service is requested to
operate.
This patch implements Lockd per-net users accounting. New per-net counter is
used to determine, when per-net resources have to be freed.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# c228fa20 31-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

Lockd: create permanent lockd sockets in current network namespace

This patch parametrizes Lockd permanent sockets creation routine by network
namespace context.
It also replaces hard-coded init_net with current network namespace context in
Lockd sockets creation routines.
This approach looks safe, because Lockd is created during NFS mount (or NFS
server start) and thus socket is required exactly in current network namespace
context. But in the same time it means, that Lockd sockets inherits first Lockd
requester network namespace. This issue will be fixed in further patches of the
series.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 4cb54ca2 20-Jan-2012 Stanislav Kinsbursky <skinsbursky@parallels.com>

SUNRPC: search for service transports in network namespace context

Service transports are parametrized by network namespace. And thus lookup of
transport instance have to take network namespace into account.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>


# 11fd165c 28-Jul-2011 Eric Dumazet <eric.dumazet@gmail.com>

sunrpc: use better NUMA affinities

Use NUMA aware allocations to reduce latencies and increase throughput.

sunrpc kthreads can use kthread_create_on_node() if pool_mode is
"percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
also take into account NUMA node affinity for memory allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: Neil Brown <neilb@suse.de>
CC: David Miller <davem@davemloft.net>
Reviewed-by: Greg Banks <gnb@fastmail.fm>
[bfields@redhat.com: fix up caller nfs41_callback_up]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 763641d8 26-Oct-2010 Arnd Bergmann <arnd@arndb.de>

lockd: push lock_flocks down

lockd should use lock_flocks() instead of lock_kernel()
to lock against posix locks accessing the i_flock list.

This is a prerequisite to turning lock_flocks into a
spinlock.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: J. Bruce Fields <bfields@redhat.com>


# fc5d00b0 29-Sep-2010 Pavel Emelyanov <xemul@parallels.com>

sunrpc: Add net argument to svc_create_xprt

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# d6783b2b 26-Jan-2010 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Bury "#ifdef IPV6" in svc_create_xprt()

Clean up: Bruce observed we have more or less common logic in each of
svc_create_xprt()'s callers: the check to create an IPv6 RPC listener
socket only if CONFIG_IPV6 is set. I'm about to add another case
that does just the same.

If we move the ifdefs into __svc_xpo_create(), then svc_create_xprt()
call sites can get rid of the "#ifdef" ugliness, and can use the same
logic with or without IPv6 support available in the kernel.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 6d456111 16-Nov-2009 Eric W. Biederman <ebiederm@xmission.com>

sysctl: Drop & in front of every proc_handler.

For consistency drop & in front of every proc_handler. Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


# ab09203e 05-Nov-2009 Eric W. Biederman <ebiederm@xmission.com>

sysctl fs: Remove dead binary sysctl support

Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name
and .strategy members of sysctl tables are dead code. Remove them.

Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


# 89996df4 06-May-2009 J. Bruce Fields <bfields@citi.umich.edu>

lockd: fix list corruption on lockd restart

If lockd is signalled soon enough after restart then locks_start_grace()
will try to re-add an entry to a list and trigger a lock corruption
warning.

Thanks to Wang Chen for the problem report and diagnosis.

WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
...
list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128).
...
Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3
Call Trace:
[<c042d5b5>] warn_slowpath+0x71/0xa0
[<c0422a96>] ? update_curr+0x11d/0x125
[<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150
[<c044b270>] ? trace_hardirqs_on+0xb/0xd
[<c051c61a>] ? _raw_spin_lock+0x53/0xfa
[<c051c89f>] __list_add+0x27/0x5c
[<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd]
[<ef8f34da>] set_grace_period+0x39/0x53 [lockd]
[<c06b8921>] ? lock_kernel+0x1c/0x28
[<ef8f3558>] lockd+0x64/0x164 [lockd]
[<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150
[<c04227b0>] ? complete+0x34/0x3e
[<ef8f34f4>] ? lockd+0x0/0x164 [lockd]
[<ef8f34f4>] ? lockd+0x0/0x164 [lockd]
[<c043dd42>] kthread+0x45/0x6b
[<c043dcfd>] ? kthread+0x0/0x6b
[<c0403c23>] kernel_thread_helper+0x7/0x10

Reported-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: stable@kernel.org


# eb16e907 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

lockd: Start PF_INET6 listener only if IPv6 support is available

Apparently a lot of people need to disable IPv6 completely on their
distributor-built systems, which have CONFIG_IPV6_MODULE enabled at
build time.

They do this by blacklisting the ipv6.ko module. This causes the
creation of the lockd service listener to fail if CONFIG_IPV6_MODULE
is set, but the module cannot be loaded.

Now that the kernel's PF_INET6 RPC listeners are completely separate
from PF_INET listeners, we can always start PF_INET. Then lockd can
try to start PF_INET6, but it isn't required to be available.

Note this has the added benefit that NLM callbacks from AF_INET6
servers will never come from AF_INET remotes. We no longer have to
worry about matching mapped IPv4 addresses to AF_INET when comparing
addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 26298caa 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

NFS: Revert creation of IPv6 listeners for lockd and NFSv4 callbacks

We're about to convert over to using separate PF_INET and PF_INET6
listeners, instead of a single PF_INET6 listener that also receives
AF_INET requests and maps them to AF_INET6.

Clear the way by removing the logic in lockd and the NFSv4 callback
server that creates an AF_INET6 service listener.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 49a9072f 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove @family argument from svc_create() and svc_create_pooled()

Since an RPC service listener's protocol family is specified now via
svc_create_xprt(), it no longer needs to be passed to svc_create() or
svc_create_pooled(). Remove that argument from the synopsis of those
functions, and remove the sv_family field from the svc_serv struct.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 9652ada3 18-Mar-2009 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Change svc_create_xprt() to take a @family argument

The sv_family field is going away. Pass a protocol family argument to
svc_create_xprt() instead of extracting the family from the passed-in
svc_serv struct.

Again, as this is a listener socket and not an address, we make this
new argument an "int" protocol family, instead of an "sa_family_t."

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 0dba7c2a 31-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Clean up flow of control in make_socks() function

Clean up: Use Bruce's preferred control flow style in make_socks().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# d3fe5ea7 31-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Refactor make_socks() function

Clean up: extract common logic in NLM's make_socks() function
into a helper.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# b064ec03 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

lockd: Enable NLM use of AF_INET6

If the kernel is configured to support IPv6 and the RPC server can register
services via rpcbindv4, we are all set to enable IPv6 support for lockd.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aime Le Rouzic <aime.le-rouzic@bull.net>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# b7ba597f 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NSM: Move nsm_use_hostnames to mon.c

Clean up.

Treat the nsm_use_hostnames global variable like nsm_local_state.
Note that the default value of nsm_use_hostnames is still zero.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# e6765b83 11-Dec-2008 Chuck Lever <chuck.lever@oracle.com>

NSM: Remove include/linux/lockd/sm_inter.h

Clean up: The include/linux/lockd/sm_inter.h header is nearly empty
now. Remove it.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# c72a476b 20-Oct-2008 Jeff Layton <jlayton@kernel.org>

lockd: set svc_serv->sv_maxconn to a more reasonable value (try #3)

The default method for calculating the number of connections allowed
per RPC service arbitrarily limits single-threaded services to 80
connections. This is too low for services like lockd and artificially
limits the number of TCP clients that it can support.

Have lockd set a default sv_maxconn value to 1024 (which is the typical
default value for RLIMIT_NOFILE. Also add a module parameter to allow an
admin to set this to an arbitrary value.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 2de59872 23-Dec-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

LOCKD: Make lockd_up() and lockd_down() exported GPL-only

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 2c5e7615 20-Nov-2008 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: clean up grace period on early exit

If nfsd was shut down before the grace period ended, we could end up
with a freed object still on grace_list. Thanks to Jeff Moyer for
reporting the resulting list corruption warnings.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Jeff Moyer <jmoyer@redhat.com>


# 26a41409 03-Oct-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Remove "proto" argument from lockd_up()

Clean up: Now that lockd_up() starts listeners for both transports, the
"proto" argument is no longer needed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 8c3916f4 03-Oct-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: Always start both UDP and TCP listeners

Commit 24e36663, which first appeared in 2.6.19, changed lockd so that
the client side starts a UDP listener only if there is a UDP NFSv2/v3
mount. Its description notes:

This... means that lockd will *not* listen on UDP if the only
mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know
if this might be a problem with some servers.

Unfortunately it is a problem for Linux itself. The rpc.statd daemon
on Linux uses UDP for contacting the local lockd, no matter which
protocol is used for NFS mounts. Without a local lockd UDP listener,
NFSv2/v3 lock recovery from Linux NFS clients always fails.

Revert parts of commit 24e36663 so lockd_up() always starts both
listeners.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# af558e33 05-Sep-2007 J. Bruce Fields <bfields@citi.umich.edu>

nfsd: common grace period control

Rewrite grace period code to unify management of grace period across
lockd and nfsd. The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it. This creates a slight race condition, since
the enforcement is not coordinated. It's also more complicated than
necessary.

Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.

We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# c8ab5f2a 18-Mar-2008 J. Bruce Fields <bfields@citi.umich.edu>

lockd: don't depend on lockd main loop to end grace

End lockd's grace period using schedule_delayed_work() instead of a
check on every pass through the main loop.

After a later patch, we'll depend on lockd to end its grace period even
if it's not currently handling requests; so it shouldn't depend on being
woken up from the main loop to do so.

Also, Nakano Hiroaki (who independently produced a similar patch)
noticed that the current behavior is buggy in the face of jiffies
wraparound:

"lockd uses time_before() to determine whether the grace period
has expired. This would seem to be enough to avoid timer
wrap-around issues, but, unfortunately, that is not the case.
The time_* family of comparison functions can be safely used to
compare jiffies relatively close in time, but they stop working
after approximately LONG_MAX/2 ticks. nfsd can suffer this
problem because the time_before() comparison in lockd() is not
performed until the first request comes in, which means that if
there is no lockd traffic for more than LONG_MAX/2 ticks we are
screwed.

"The implication of this is that once time_before() starts
misbehaving any attempt from a NFS client to execute fcntl()
will be received with a NLM_LCK_DENIED_GRACE_PERIOD message for
25 days (assuming HZ=1000). In other words, the 50 seconds grace
period could turn into a grace period of 50 days or more.

"Note: This bug was analyzed independently by Oda-san
<oda@valinux.co.jp> and myself."

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Nakano Hiroaki <nakano.hiroaki@oss.ntt.co.jp>
Cc: Itsuro Oda <oda@valinux.co.jp>


# 8fafa900 24-Jan-2008 J. Bruce Fields <bfields@citi.umich.edu>

locks: allow lockd to process blocked locks during grace period

The check here is currently harmless but unnecessary, since, as the
comment notes, there aren't any blocked-lock callbacks to process
during the grace period anyway.

And eventually we want to allow multiple grace periods that come and go
for different filesystems over the course of the lifetime of lockd, at
which point this check is just going to get in the way.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# e851db5b 30-Jun-2008 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Add address family field to svc_serv data structure

Introduce and initialize an address family field in the svc_serv structure.

This field will determine what family to use for the service's listener
sockets and what families are advertised via the local rpcbind daemon.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# abd1ec4e 11-Jun-2008 Jeff Layton <jlayton@kernel.org>

lockd: close potential race with rapid lockd_up/lockd_down cycle

If lockd_down is called very rapidly after lockd_up returns, then
there is a slim chance that lockd() will never be called. kthread()
will return before calling the function, so we'll end up never
actually calling the cleanup functions for the thread.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# f97c650d 08-Apr-2008 Jeff Layton <jlayton@kernel.org>

NLM: don't let lockd exit on unexpected svc_recv errors (try #2)

When svc_recv returns an unexpected error, lockd will print a warning
and exit. This problematic for several reasons. In particular, it will
cause the reference counts for the thread to be wrong, and can lead to a
potential BUG() call.

Rather than exiting on error from svc_recv, have the thread do a 1s
sleep and then retry the loop. This is unlikely to cause any harm, and
if the error turns out to be something temporary then it may be able to
recover.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# d751a7cd 07-Feb-2008 Jeff Layton <jlayton@kernel.org>

NLM: Convert lockd to use kthreads

Have lockd_up start lockd using kthread_run. With this change,
lockd_down now blocks until lockd actually exits, so there's no longer
need for the waitqueue code at the end of lockd_down. This also means
that only one lockd can be running at a time which simplifies the code
within lockd's main loop.

This also adds a check for kthread_should_stop in the main loop of
nlmsvc_retry_blocked and after that function returns. There's no sense
continuing to retry blocks if lockd is coming down anyway.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 90d5b180 14-Mar-2008 Chuck Lever <chuck.lever@oracle.com>

NLM: LOCKD fails to load if CONFIG_SYSCTL is not set

Bruce Fields says:
"By the way, we've got another config-related nit here:

http://bugzilla.linux-nfs.org/show_bug.cgi?id=156

You can build lockd without CONFIG_SYSCTL set, but then the module will
fail to load."

For now, disable the sysctl registration calls in lockd if CONFIG_SYSCTL
is not enabled. This allows the kernel to build properly if PROC_FS or
SYSCTL is not enabled, but an NFS client is desired.

In the long run, we would like to be able to build the kernel with an
NFS client but without lockd. This makes sense, for example, if you want
an NFSv4-only NFS client, as NFSv4 doesn't use NLM at all.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 5216a8e7 21-Feb-2008 Pavel Emelyanov <xemul@openvz.org>

Wrap buffers used for rpc debug printks into RPC_IFDEBUG

Sorry for the noise, but here's the v3 of this compilation fix :)

There are some places, which declare the char buf[...] on the stack
to push it later into dprintk(). Since the dprintk sometimes (if the
CONFIG_SYSCTL=n) becomes an empty do { } while (0) stub, these buffers
cause gcc to produce appropriate warnings.

Wrap these buffers with RPC_IFDEBUG macro, as Trond proposed, to
compile them out when not needed.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a217813f 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

knfsd: Support adding transports by writing portlist file

Update the write handler for the portlist file to allow creating new
listening endpoints on a transport. The general form of the string is:

<transport_name><space><port number>

For example:

echo "tcp 2049" > /proc/fs/nfsd/portlist

This is intended to support the creation of a listening endpoint for
RDMA transports without adding #ifdef code to the nfssvc.c file.

Transports can also be removed as follows:

'-'<transport_name><space><port number>

For example:

echo "-tcp 2049" > /proc/fs/nfsd/portlist

Attempting to add a listener with an invalid transport string results
in EPROTONOSUPPORT and a perror string of "Protocol not supported".

Attempting to remove an non-existent listener (.e.g. bad proto or port)
results in ENOTCONN and a perror string of
"Transport endpoint is not connected"

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 7fcb98d5 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Add svc API that queries for a transport instance

Add a new svc function that allows a service to query whether a
transport instance has already been created. This is used in lockd
to determine whether or not a transport needs to be created when
a lockd instance is brought up.

Specifying 0 for the address family or port is effectively a wild-card,
and will result in matching the first transport in the service's list
that has a matching class name.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 7a182083 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Make close transport independent

Move sk_list and sk_ready to svc_xprt. This involves close because these
lists are walked by svcs when closing all their transports. So I combined
the moving of these lists to svc_xprt with making close transport independent.

The svc_force_sock_close has been changed to svc_close_all and takes a list
as an argument. This removes some svc internals knowledge from the svcs.

This code races with module removal and transport addition.

Thanks to Simon Holm Thøgersen for a compile fix.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Simon Holm Thøgersen <odie@cs.aau.dk>


# d7c9f1ed 30-Dec-2007 Tom Tucker <tom@opengridcomputing.com>

svc: Change services to use new svc_create_xprt service

Modify the various kernel RPC svcs to use the svc_create_xprt service.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Acked-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 9a8db97e 17-Jul-2007 Marc Eshel <eshel@almaden.ibm.com>

knfsd: lockd: nfsd4: use same grace period for lockd and nfsd4

Both lockd and (in the nfsv4 case) nfsd enforce a "grace period" after reboot,
during which clients may reclaim locks from the previous server instance, but
may not acquire new locks.

Currently the lockd and nfsd enforce grace periods of different lengths. This
may cause problems when we reboot a server with both v2/v3 and v4 clients.
For example, if the lockd grace period is shorter (as is likely the case),
then a v3 client might acquire a new lock that conflicts with a lock already
held (but not yet reclaimed) by a v4 client.

This patch calculates a lease time that lockd and nfsd can both use.

Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 83144186 17-Jul-2007 Rafael J. Wysocki <rjw@rjwysocki.net>

Freezer: make kernel threads nonfreezable by default

Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# f61534df 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove redundant calls to rpciod_up()/rpciod_down()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 405ae7d3 17-Feb-2007 Robert P. J. Day <rpjday@mindspring.com>

Replace remaining references to "driverfs" with "sysfs".

Globally, s/driverfs/sysfs/g.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>


# 0b4d4147 14-Feb-2007 Eric W. Biederman <ebiederm@xmission.com>

[PATCH] sysctl: remove insert_at_head from register_sysctl

The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ad06e4bd 12-Feb-2007 Chuck Lever <chuck.lever@oracle.com>

[PATCH] knfsd: SUNRPC: Add a function to format the address in an svc_rqst for printing

There are loads of places where the RPC server assumes that the rq_addr fields
contains an IPv4 address. Top among these are error and debugging messages
that display the server's IP address.

Let's refactor the address printing into a separate function that's smart
enough to figure out the difference between IPv4 and IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 482fb94e 12-Feb-2007 Chuck Lever <chuck.lever@oracle.com>

[PATCH] knfsd: SUNRPC: allow creating an RPC service without registering with portmapper

Sometimes we need to create an RPC service but not register it with the local
portmapper. NFSv4 delegation callback, for example.

Change the svc_makesock() API to allow optionally creating temporary or
permanent sockets, optionally registering with the local portmapper, and make
it return the ephemeral port of the new socket.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Aurelien Charbon <aurelien.charbon@ext.bull.net>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7cc13edc 06-Nov-2006 Eric W. Biederman <ebiederm@xmission.com>

[PATCH] sysctl: implement CTL_UNNUMBERED

This patch takes the CTL_UNNUMBERD concept from NFS and makes it available to
all new sysctl users.

At the same time the sysctl binary interface maintenance documentation is
updated to mention and to describe what is needed to successfully maintain the
sysctl binary interface.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 460f5cac 04-Oct-2006 Olaf Kirch <okir@suse.de>

[PATCH] knfsd: export nsm_local_state to user space via sysctl

Every NLM call includes the client's NSM state. Currently, the Linux client
always reports 0 - which seems not to cause any problems, but is not what the
protocol says.

This patch exposes the kernel's internal variable to user space via a sysctl,
which can be set at system boot time by statd.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# abd1f500 04-Oct-2006 Olaf Kirch <okir@suse.de>

[PATCH] knfsd: lockd: optionally use hostnames for identifying peers

This patch adds the nsm_use_hostnames sysctl and module param. If set, lockd
will use the client's name (as given in the NLM arguments) to find the NSM
handle. This makes recovery work when the NFS peer is multi-homed, and the
reboot notification arrives from a different IP than the original lock calls.

Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 4a3ae42d 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Correctly handle error condition from lockd_up

If lockd_up fails - what should we expect? Do we have to later call
lockd_down?

Well the nfs client thinks "no", the nfs server thinks "yes". lockd thinks
"yes".

The only answer that really makes sense is "no" !!

So:
Make lockd_up only increment nlmsvc_users on success.
Make nfsd handle errors from lockd_up properly.
Make sure lockd_up(0) never fails when lockd is running
so that the 'reclaimer' call to lockd_up doesn't need to
be error checked.

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 7dcf91ec 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Move makesock failed warning into make_socks.

Thus it is printed for any path that leads to failure (make_socks is called
from two places).

Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 6fb2b47f 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: Drop 'serv' option to svc_recv and svc_process

It isn't needed as it is available in rqstp->rq_server, and dropping it allows
some local vars to be dropped.

[akpm@osdl.org: build fix]
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 24e36663 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: be more selective in which sockets lockd listens on

Currently lockd listens on UDP always, and TCP if CONFIG_NFSD_TCP is set.

However as lockd performs services of the client as well, this is a problem.
If CONFIG_NfSD_TCP is not set, and a tcp mount is used, the server will not be
able to call back to lockd.

So:
- add an option to lockd_up saying which protocol is needed
- Always open sockets for which an explicit port was given, otherwise
only open a socket of the type required
- Change nfsd to do one lockd_up per socket rather than one per thread.

This
- removes the dependancy on CONFIG_NFSD_TCP
- means that lockd may open sockets other than at startup
- means that lockd will *not* listen on UDP if the only
mounts are TCP mount (and nfsd hasn't started).

The latter is the only one that concerns me at all - I don't know if this
might be a problem with some servers.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# bc591ccf 02-Oct-2006 NeilBrown <neilb@suse.de>

[PATCH] knfsd: add a callback for when last rpc thread finishes

nfsd has some cleanup that it wants to do when the last thread exits, and
there will shortly be some more. So collect this all into one place and
define a callback for an rpc service to call when the service is about to be
destroyed.

[akpm@osdl.org: cleanups, build fix]
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 6ab3d562 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de>

Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>


# 353ab6e9 26-Mar-2006 Ingo Molnar <mingo@elte.hu>

[PATCH] sem2mutex: fs/

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org>
Cc: Robert Love <rml@tech9.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# e8c96f8c 24-Mar-2006 Tobias Klauser <tklauser@nuerscht.ch>

[PATCH] fs: Use ARRAY_SIZE macro

Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a
duplicate of ARRAY_SIZE. Some trailing whitespaces are also deleted.

Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch>
Cc: David Howells <dhowells@redhat.com>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Chris Mason <mason@suse.com>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 2bd61579 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure that SIGKILL will always terminate a synchronous RPC call.

...and make sure that the "intr" flag also enables SIGHUP and SIGTERM to
interrupt RPC calls too (as per the Solaris implementation).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 7ee91ec1 13-Jul-2005 Steve Dickson <SteveD@redhat.com>

[PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64

Allow the setting of NLM timeouts and grace periods through the proc and
sysclt interfaces on x86_64 architectures

Signed-off-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 46be925f 23-Jun-2005 NeilBrown <neilb@cse.unsw.edu.au>

[PATCH] knfsd: lockd: flush signals on shutdown

Silence another annoying "failed to contact portmap (errno -512)" on shutdown.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!