History log of /linux-master/net/sunrpc/sched.c
Revision Date Author Comments
# d180891f 09-May-2023 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Don't change task->tk_status after the call to rpc_exit_task

Some calls to rpc_exit_task() may deliberately change the value of
task->tk_status, for instance because it gets checked by the RPC call's
rpc_release() callback. That makes it wrong to reset the value to
task->tk_rpc_status.
In particular this causes a bug where the rpc_call_done() callback tries
to fail over a set of pNFS/flexfiles writes to a different IP address,
but the reset of task->tk_status causes nfs_commit_release_pages() to
immediately mark the file as having a fatal error.

Fixes: 39494194f93b ("SUNRPC: Fix races with rpc_killall_tasks()")
Cc: stable@vger.kernel.org # 6.1.x
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 691d0b78 18-Apr-2023 Dai Ngo <dai.ngo@oracle.com>

SUNRPC: remove the maximum number of retries in call_bind_status

Currently call_bind_status places a hard limit of 3 on the number of
retries on an EACCES error. This limit was added to prevent NLM unlock
requests from hanging forever when the server keeps returning garbage.
However, the limit causes problems when the NLM service takes
longer than 9 seconds to register with the port mapper after a restart.

This patch removes the hard-coded limit and lets the RPC layer handle
the retry based on the standard hard/soft task semantics.

Fixes: 0b760113a3a1 ("NLM: Don't hang forever on NLM unlock requests")
Reported-by: Helen Chao <helen.chao@oracle.com>
Tested-by: Helen Chao <helen.chao@oracle.com>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# f8423909 05-Oct-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls

Add the helper rpc_cancel_tasks(), which uses a caller-defined selection
function to define a set of in-flight RPC calls to cancel. This is
mainly intended for pNFS drivers which are subject to a layout recall,
and which may therefore want to cancel all pending I/O using that layout
in order to redrive it after the layout recall has been satisfied.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 39494194 05-Oct-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix races with rpc_killall_tasks()

Ensure that we immediately call rpc_exit_task() after waking up, and
that the tk_rpc_status cannot get clobbered by some other function.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# f5d39b02 22-Aug-2022 Peter Zijlstra <peterz@infradead.org>

freezer,sched: Rewrite core freezer logic

Rewrite the core freezer to behave better wrt thawing and be simpler
in general.

By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.

As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).

Specifically; the current scheme works a little like:

    freezer_do_not_count();
    schedule();
    freezer_count();

And either the task is blocked, or it lands in try_to_freeze()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.

However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.

That is, thawing tries to thaw things in an explicit order: kernel
threads and workqueues before bringing SMP back, before userspace,
etc. However, due to the above-mentioned races, it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.

This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.

As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:

    TASK_FREEZABLE -> TASK_FROZEN
    __TASK_STOPPED -> TASK_FROZEN
    __TASK_TRACED  -> TASK_FROZEN

The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).

The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.

With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org


# 4b8dbdfb 28-Apr-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix an RPC/RDMA performance regression

Use the standard gfp mask instead of using GFP_NOWAIT. The latter causes
issues when under memory pressure.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 25cf32ad 06-Apr-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Handle allocation failure in rpc_new_task()

If the call to rpc_alloc_task() fails, then ensure that the calldata is
released, and that rpc_run_task() and rpc_run_bc_task() bail out early.

Reported-by: NeilBrown <neilb@suse.de>
Fixes: 910ad38697d9 ("NFS: Fix memory allocation in rpc_alloc_task()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 910ad386 21-Mar-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: Fix memory allocation in rpc_alloc_task()

As for rpc_malloc(), we first try allocating from the slab, then fall
back to a non-waiting allocation from the mempool.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 33e5c765 14-Mar-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: Fix memory allocation in rpc_malloc()

When in a low memory situation, we do want rpciod to kick off direct
reclaim in the case where that helps, however we don't want it looping
forever in mempool_alloc().
So first try allocating from the slab using GFP_KERNEL | __GFP_NORETRY,
and then fall back to a GFP_NOWAIT allocation from the mempool.

Ditto for rpc_alloc_task()

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
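
The two-stage strategy described in this commit message (try the general
allocator first, then fall back to a non-waiting attempt on a reserve) can be
illustrated with a small userspace sketch. This is not the kernel code: the
names reserve_pool, reserve_try_alloc() and two_stage_alloc() are hypothetical,
malloc() stands in for the slab attempt, a fixed array stands in for the
mempool, and the primary_ok parameter only simulates memory pressure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RESERVE_SLOTS     2
#define RESERVE_SLOT_SIZE 256

/* Hypothetical stand-in for a preallocated reserve (the "mempool"). */
struct reserve_pool {
    unsigned char slots[RESERVE_SLOTS][RESERVE_SLOT_SIZE];
    int in_use[RESERVE_SLOTS];
};

/* Non-waiting attempt: returns NULL at once instead of blocking. */
static void *reserve_try_alloc(struct reserve_pool *pool, size_t size)
{
    assert(pool != NULL);
    if (size > RESERVE_SLOT_SIZE)
        return NULL;
    for (int i = 0; i < RESERVE_SLOTS; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = 1;
            return pool->slots[i];
        }
    }
    return NULL;
}

/*
 * Stage 1: general allocator (primary_ok == 0 simulates a failed attempt).
 * Stage 2: fall back to the reserve without ever waiting for it to refill.
 */
static void *two_stage_alloc(struct reserve_pool *pool, size_t size,
                             int primary_ok)
{
    void *p = primary_ok ? malloc(size) : NULL;

    if (p != NULL)
        return p;
    return reserve_try_alloc(pool, size);
}
```

The point of the ordering is that the caller never sleeps on the reserve: once
both stages fail it gets NULL back and can decide how to retry.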


# 8db55a03 06-Mar-2022 NeilBrown <neilb@suse.de>

SUNRPC: improve 'swap' handling: scheduling and PF_MEMALLOC

rpc tasks can be marked as RPC_TASK_SWAPPER. This causes GFP_MEMALLOC
to be used for some allocations. This is needed in some cases, but not
in all where it is currently provided, and in some where it isn't
provided.

Currently *all* tasks associated with a rpc_client on which swap is
enabled get the flag and hence some GFP_MEMALLOC support.

GFP_MEMALLOC is provided for ->buf_alloc() but only swap-writes need it.
However xdr_alloc_bvec does not get GFP_MEMALLOC - though it often does
need it.

xdr_alloc_bvec is called while the XPRT_LOCK is held. If this blocks,
then it blocks all other queued tasks. So this allocation needs
GFP_MEMALLOC for *all* requests, not just writes, when the xprt is used
for any swap writes.

Similarly, if the transport is not connected, that will block all
requests including swap writes, so memory allocations should get
GFP_MEMALLOC if swap writes are possible.

So with this patch:
 1/ we ONLY set RPC_TASK_SWAPPER for swap writes.
 2/ __rpc_execute() sets PF_MEMALLOC while handling any task
    with RPC_TASK_SWAPPER set, or when handling any task that
    holds the XPRT_LOCKED lock on an xprt used for swap.
    This removes the need for the RPC_IS_SWAPPER() test
    in ->buf_alloc handlers.
 3/ xprt_prepare_transmit() sets PF_MEMALLOC after locking
    any task to a swapper xprt. __rpc_execute() will clear it.
 4/ PF_MEMALLOC is set for all the connect workers.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com> (for xprtrdma parts)
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# a80a8461 06-Mar-2022 NeilBrown <neilb@suse.de>

SUNRPC: remove scheduling boost for "SWAPPER" tasks.

Currently, tasks marked as "swapper" tasks get put to the front of
non-priority rpc_queues, and are sorted earlier than non-swapper tasks on
the transport's ->xmit_queue.

This is pointless as currently *all* tasks for a mount that has swap
enabled on *any* file are marked as "swapper" tasks. So the net result
is that the non-priority rpc_queues are reverse-ordered (LIFO).

This scheduling boost is not necessary to avoid deadlocks, and hurts
fairness, so remove it. If there were a need to expedite some requests,
the tk_priority mechanism is a more appropriate tool.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# c487216b 06-Mar-2022 NeilBrown <neilb@suse.de>

SUNRPC/call_alloc: async tasks mustn't block waiting for memory

When memory is short, new worker threads cannot be created and we depend
on the minimum one rpciod thread to be able to handle everything.
So it must not block waiting for memory.

mempools are particularly a problem as memory can only be released back
to the mempool by an async rpc task running. If all available
workqueue threads are waiting on the mempool, no thread is available to
return anything.

rpc_malloc() can block, and this might cause deadlocks.
So check RPC_IS_ASYNC(), rather than RPC_IS_SWAPPER() to determine if
blocking is acceptable.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 0adc8794 28-Jan-2022 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Convert GFP_NOFS to GFP_KERNEL

The sections which should not re-enter the filesystem are already
protected with memalloc_nofs_save/restore calls, so it is better to use
GFP_KERNEL in these calls to allow better performance for synchronous
RPC calls.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# b40887e1 16-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Trace calls to .rpc_call_done

Introduce a single tracepoint that can replace simple dprintk call
sites in upper layer "rpc_call_done" callbacks. Example:

kworker/u24:2-1254 [001] 771.026677: rpc_stats_latency: task:00000001@00000002 xid=0x16a6f3c0 rpcbindv2 GETPORT backlog=446 rtt=101 execute=555
kworker/u24:2-1254 [001] 771.026677: rpc_task_call_done: task:00000001@00000002 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpcb_getport_done
kworker/u24:2-1254 [001] 771.026678: rpcb_setport: task:00000001@00000002 status=0 port=20048

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 0392dd51 04-Oct-2021 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Per-rpc_clnt task PIDs

The current range of RPC task PIDs is 0..65535. That's not adequate
for distinguishing tasks across multiple rpc_clnts running high
throughput workloads.

To help relieve this situation and to reduce the bottleneck of
having a single atomic for assigning all RPC task PIDs, assign task
PIDs per rpc_clnt.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 6dbcbe3f 12-Jul-2021 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Remove WQ_HIGHPRI from xprtiod

Don't let xprtiod pre-empt softirq.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 47dd8796 12-Jul-2021 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add cond_resched() at the appropriate point in __rpc_execute()

Allow tasks that need to pre-empt rpciod/xprtiod to do so when it is
safe.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 5483b904 26-Jun-2021 Zhang Xiaoxu <zhangxiaoxu5@huawei.com>

SUNRPC: Should wake up the privileged task firstly.

When picking a task from the wait queue to wake up, a non-privileged task
may be found rather than a privileged one. This may lead to a deadlock
similar to the one in commit dfe1fe75e00e ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):

A privileged delegreturn task is queued to the privileged list because all
the slots are assigned. If there are not enough slots to wake up the
non-privileged batch tasks (session with fewer than 8 slots), then the
privileged delegreturn task may never be woken up, because the task that
was found cannot get a slot while the session is draining.

So we should treat the privileged task as an emergency task, and
execute it as soon as we can.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# fcb170a9 26-Jun-2021 Zhang Xiaoxu <zhangxiaoxu5@huawei.com>

SUNRPC: Fix the batch tasks count wraparound.

The 'queue->nr' counter will wrap around from 0 to 255 when only the
current priority queue has tasks. This may lead to a deadlock similar to
the one in commit dfe1fe75e00e ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):

A privileged delegreturn task is queued to the privileged list because all
the slots are assigned. When a non-privileged task completes and releases
its slot, another non-privileged task may be picked. It may then fail to
allocate a slot while the session is draining.

If 'queue->nr' has wrapped around to 255 and there are not enough slots to
service it, then the privileged delegreturn task will never be woken up.

So we should avoid the wraparound of 'queue->nr'.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# f0940f4b 03-Mar-2021 Benjamin Coddington <bcodding@redhat.com>

SUNRPC: Set memalloc_nofs_save() for sync tasks

We could recurse into NFS doing memory reclaim while sending a sync task,
which might result in a deadlock. Set memalloc_nofs_save for sync task
execution.

Fixes: a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# e4c72201 22-Oct-2020 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: rpc_wake_up() should wake up tasks in the correct order

Currently, we wake up the tasks by priority queue ordering, which means
that we ignore the batching that is supposed to help with QoS issues.

Fixes: c049f8ea9a0d ("SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 5589cc47 08-Jul-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove remaining dprintks from sched.c

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 721a1d38 08-Jul-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove dprintk call sites in RPC queuing functions

Remove redundant call sites or call sites that are already covered
by tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 1466c221 08-Jul-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Clean up RPC scheduler tracepoints

Remove several redundant dprintk call sites, and replace a couple of
potentially useful ones with tracepoints.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 78069487 08-Jul-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove debugging instrumentation from xprt_release

These instruments don't appear to add any substantial value.

We already have this at the termination of each RPC:

iozone-2617 [002] 975.713126: rpc_stats_latency: task:418@5 xid=0x260eab5d nfsv3 LOOKUP backlog=15 rtt=32 execute=58
iozone-2617 [002] 975.713127: xprt_release_cong: task:418@5 snd_task:4294967295 cong=256 cwnd=16384
iozone-2617 [002] 975.713127: xprt_put_cong: task:418@5 snd_task:4294967295 cong=0 cwnd=16384

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 06e234c6 08-Jul-2020 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Hoist trace_xprtrdma_op_allocate into generic code

Introduce a tracepoint in call_allocate that reports the exact
sizes in the RPC buffer allocation request and the status of the
result. This helps catch problems with XDR buffer provisioning,
and replaces transport-specific debugging instrumentation.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 1fab7dc4 04-Apr-2020 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Don't start a timer on an already queued rpc task

Move the test for whether a task is already queued to prevent
corruption of the timer list in __rpc_sleep_on_priority_timeout().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 7eac5264 07-Feb-2020 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add a flag to avoid reference counts on credentials

Add a flag to signal to the RPC layer that the credential is already
pinned for the duration of the RPC call.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# abf8af78 23-Dec-2019 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Capture signalled RPC tasks

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# a264abad 20-Nov-2019 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Capture completion of all RPC tasks

RPC tasks on the backchannel never invoke xprt_complete_rqst(), so
there is no way to report their tk_status at completion. Also, any
RPC task that exits via rpc_exit_task() before it is replied to will
also disappear without a trace.

Introduce a trace point that is symmetrical with rpc_task_begin that
captures the termination status of each RPC task.

Sample trace output for callback requests initiated on the server:
kworker/u8:12-448 [003] 127.025240: rpc_task_end: task:50@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task
kworker/u8:12-448 [002] 127.567310: rpc_task_end: task:51@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task
kworker/u8:12-448 [001] 130.506817: rpc_task_end: task:52@3 flags=ASYNC|DYNAMIC|SOFT|SOFTCONN|SENT runstate=RUNNING|ACTIVE status=0 action=rpc_exit_task

Odd, though, that I never see trace_rpc_task_complete, either in the
forward or backchannel. Should it be removed?

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 66eb3add 05-Nov-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Avoid RPC delays when exiting suspend

Jon Hunter: "I have been tracking down another suspend/NFS related
issue where again I am seeing random delays exiting suspend. The delays
can be up to a couple minutes in the worst case and this is causing a
suspend test we have to fail."

Change the use of a deferrable work to a standard delayed one.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Fixes: 7e0a0e38fcfea ("SUNRPC: Replace the queue timer with a delayed work function")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 714fbc73 12-Sep-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: RPC level errors should always set task->tk_rpc_status

Ensure that we set task->tk_rpc_status for all RPC level errors so that
the caller can distinguish between those and server reply status errors.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 691b45dd 19-Aug-2019 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Remove rpc_wake_up_queued_task_on_wq()

Clean up: commit c544577daddb ("SUNRPC: Clean up transport write
space handling") appears to have removed the last caller of
rpc_wake_up_queued_task_on_wq().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# a101b043 11-Jul-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix transport accounting when caller specifies an rpc_xprt

Ensure that we do the required accounting for the round robin queue
when the caller to rpc_init_task() has passed in a transport to be
used.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Reported-by: Neil Brown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 675dd90a 19-Jun-2019 Chuck Lever <chuck.lever@oracle.com>

xprtrdma: Modernize ops->connect

Adapt and apply changes that were made to the TCP socket connect
code. See the following commits for details on the purpose of
these changes:

Commit 7196dbb02ea0 ("SUNRPC: Allow changing of the TCP timeout parameters on the fly")
Commit 3851f1cdb2b8 ("SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout")
Commit 02910177aede ("SUNRPC: Fix reconnection timeouts")

Some common transport code is moved to xprt.c to satisfy the code
duplication police.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 9dfe52a9 23-May-2019 Dave Wysochanski <dwysocha@redhat.com>

SUNRPC: Move call to rpc_count_iostats before rpc_call_done

For diagnostic purposes, it would be useful to have an rpc_iostats
metric of RPCs completing with tk_status < 0. Unfortunately,
tk_status is reset inside the rpc_call_done functions for each
operation, and the call to tally the per-op metrics comes after
rpc_call_done. Refactor the call to rpc_count_iostat earlier in
rpc_exit_task so we can count these RPCs completing in error.

Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# c049f8ea 02-May-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Remove the bh-safe lock requirement on the rpc_wait_queue->lock

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 7e0a0e38 01-May-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Replace the queue timer with a delayed work function

The queue timer function, which walks the RPC queue in order to locate
candidates for waking up, is one of the current constraints against
removing the bh-safe queue spin locks. Replace it with a delayed
work queue, so that we can do the actual rpc task wake ups from an
ordinary process context.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 502980e8 18-Jun-2019 Anna Schumaker <Anna.Schumaker@Netapp.com>

Revert "SUNRPC: Declare RPC timers as TIMER_DEFERRABLE"

Jon Hunter reports:
"I have been noticing intermittent failures with a system suspend test on
some of our machines that have a NFS mounted root file-system. Bisecting
this issue points to your commit 431235818bc3 ("SUNRPC: Declare RPC
timers as TIMER_DEFERRABLE") and reverting this on top of v5.2-rc3 does
appear to resolve the problem.

The cause of the suspend failure appears to be a long delay observed
sometimes when resuming from suspend, and this is causing our test to
timeout."

This reverts commit 431235818bc3a919ca7487500c67c3144feece80.

Reported-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 457c8996 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Add SPDX license identifier for missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 43123581 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Declare RPC timers as TIMER_DEFERRABLE

Don't wake idle CPUs only for the purpose of servicing an RPC
queue timeout.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 24a9d9a2 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Simplify queue timeouts using timer_reduce()

Simplify the setting of queue timeouts by using the timer_reduce()
function.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 5efd1876 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Fix up tracking of timeouts

Add a helper to ensure that debugfs and friends print out the
correct current task timeout value.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 6b2e6856 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Add function rpc_sleep_on_timeout()

Clean up the RPC task sleep interfaces by replacing the task->tk_timeout
'hidden parameter' to rpc_sleep_on() with a new function that takes an
absolute timeout.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 8357a9b6 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Remove unused argument 'action' from rpc_sleep_on_priority()

None of the callers set the 'action' argument, so let's just remove it.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 87150aae 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Refactor rpc_sleep_on()

rpc_sleep_on() does not need to set the task->tk_callback under the
queue lock, so move that out.
Also refactor the check for whether the task is active.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# ae67bd38 07-Apr-2019 Trond Myklebust <trondmy@gmail.com>

SUNRPC: Fix up task signalling

The RPC_TASK_KILLED flag should really not be set from another context
because it can clobber data in the struct task when task->tk_flags is
changed non-atomically.
Let's therefore swap out RPC_TASK_KILLED with an atomic flag, and add
a function to set that flag and safely wake up the task.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 6b5f5900 09-Mar-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Remove redundant calls to RPC_IS_QUEUED()

The RPC task wakeup calls all check for RPC_IS_QUEUED() before taking any
locks. In addition, rpc_exit() already calls rpc_wake_up_queued_task().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 12a3ad61 02-Mar-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc

Convert the remaining gfp_flags arguments in sunrpc to standard reclaiming
allocations, now that we set memalloc_nofs_save() as appropriate.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# a1231fda 18-Feb-2019 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs

Set memalloc_nofs_save() on all the rpciod/xprtiod jobs so that we
ensure memory allocations for asynchronous rpc calls don't ever end
up recursing back to the NFS layer for memory reclaim.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# a52458b4 02-Dec-2018 NeilBrown <neilb@suse.com>

NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.

SUNRPC has two sorts of credentials, both of which appear as
"struct rpc_cred".
There are "generic credentials" which are supplied by clients
such as NFS and passed in 'struct rpc_message' to indicate
which user should be used to authorize the request, and there
are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
which describe the credential to be sent over the wires.

This patch replaces all the generic credentials by 'struct cred'
pointers - the credential structure used throughout Linux.

For machine credentials, there is a special 'struct cred *' pointer
which is statically allocated and recognized where needed as
having a special meaning. A look-up of a low-level cred will
map this to a machine credential.

Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 1de7eea9 02-Dec-2018 NeilBrown <neilb@suse.com>

SUNRPC: add side channel to use non-generic cred for rpc call.

The credential passed in rpc_message.rpc_cred is always a
generic credential except in one instance.
When gss_destroying_context() calls rpc_call_null(), it passes
a specific credential that it needs to destroy.
In this case the RPC acts *on* the credential rather than
being authorized by it.

This special case deserves explicit support and providing that will
mean that rpc_message.rpc_cred is *always* generic, allowing
some optimizations.

So add "tk_op_cred" to rpc_task and "rpc_op_cred" to the setup data.
Use this to pass the cred down from rpc_call_null(), and have
rpcauth_bindcred() notice it and bind it in place.

Credit to kernel test robot <fengguang.wu@intel.com> for finding
a bug in earlier version of this patch.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# f42f7c28 08-Sep-2018 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Fix priority queue fairness

Fix up the priority queue to not batch by owner, but by queue, so that
we allow '1 << priority' elements to be dequeued before switching to
the next priority queue.
The owner field is still used to wake up requests in round robin order
by owner to avoid single processes hogging the RPC layer by loading the
queues.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
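
The batching rule stated above (allow '1 << priority' dequeues before switching
queues) can be sketched as a toy model. This is an assumption-laden
illustration rather than the scheduler's real logic: the three-level layout,
the per-level counters and every sketch_* name are hypothetical, and the
round-robin-by-owner part is left out entirely.

```c
#include <assert.h>

#define SKETCH_NR_PRIORITY 3   /* hypothetical: levels 0 (low) .. 2 (high) */

struct sketch_prio_queue {
    unsigned int priority;                     /* level being drained   */
    unsigned int nr;                           /* dequeues left in batch */
    unsigned int pending[SKETCH_NR_PRIORITY];  /* queued items per level */
};

/* A level's batch budget doubles with each priority step. */
static unsigned int sketch_batch_size(unsigned int priority)
{
    assert(priority < SKETCH_NR_PRIORITY);
    return 1U << priority;
}

static void sketch_queue_init(struct sketch_prio_queue *q)
{
    q->priority = SKETCH_NR_PRIORITY - 1;
    q->nr = sketch_batch_size(q->priority);
    for (unsigned int i = 0; i < SKETCH_NR_PRIORITY; i++)
        q->pending[i] = 0;
}

/* Returns the level that was served, or -1 if every level is empty. */
static int sketch_dequeue(struct sketch_prio_queue *q)
{
    /* Keep serving the current level while its batch budget lasts. */
    if (q->nr > 0 && q->pending[q->priority] > 0) {
        q->nr--;
        q->pending[q->priority]--;
        return (int)q->priority;
    }
    /* Budget spent or level empty: try the next lower level, wrapping
     * around to the top and finally back to the current level. */
    for (unsigned int step = 1; step <= SKETCH_NR_PRIORITY; step++) {
        unsigned int p = (q->priority + SKETCH_NR_PRIORITY - step)
                         % SKETCH_NR_PRIORITY;
        if (q->pending[p] > 0) {
            q->priority = p;
            q->nr = sketch_batch_size(p) - 1;  /* this dequeue counts */
            q->pending[p]--;
            return (int)p;
        }
    }
    return -1;
}
```

In this model a busy high-priority level gets four dequeues for every one
granted to the lowest level, so low-priority work is slowed but never starved.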


# 5ce97039 07-Sep-2018 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Test whether the task is queued before grabbing the queue spinlocks

When asked to wake up an RPC task, it makes sense to test whether or not
the task is still queued.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# 359c48c0 29-Aug-2018 Trond Myklebust <trond.myklebust@hammerspace.com>

SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status

Add a helper that will wake up a task that is sleeping on a specific
queue, and will set the value of task->tk_status. This is mainly
intended for use by the transport layer to notify the task of an
error condition.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>


# e671edb9 16-Mar-2018 Chuck Lever <chuck.lever@oracle.com>

sunrpc: Simplify synopsis of some trace points

Clean up: struct rpc_task carries a pointer to a struct rpc_clnt,
and in fact task->tk_client is always what is passed into trace
points that are already passing @task.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# f515f86b 29-Jun-2017 Olga Kornievskaia <aglo@umich.edu>

fix parallelism for rpc tasks

Hi folks,

On a multi-core machine, is it expected that we can have parallel RPCs
handled by each of the per-core workqueues?

In testing a read workload, the "top" command shows that a single
"kworker" thread is servicing the requests (no parallelism).
This is more prominent when doing these operations over a krb5p mount.

Bruce suggested trying this change, and with it applied my testing
shows the read workload spread among all the kworker threads.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 90ea9f1b 06-Feb-2018 Trond Myklebust <trond.myklebust@primarydata.com>

Make the xprtiod workqueue unbounded.

This should help reduce the latency on replies.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 2275cde4 07-Feb-2018 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Queue latency-sensitive socket tasks to xprtiod

The response to a write_space notification is very latency sensitive,
so we should queue it to the lower latency xprtiod_workqueue. This
is something we already do for the other cases where an rpc task
holds the transport XPRT_LOCKED bitlock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 21ead9ff 03-Jan-2018 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Micro-optimize __rpc_execute

The common case: There are 13 to 14 actions per RPC, and tk_callback
is non-NULL in only one of them. There's no need to store a NULL in
the tk_callback field during each FSM step.

This slightly improves throughput results in dbench and other multi-
threaded benchmarks on my two-socket client on 56Gb InfiniBand, but
will probably be inconsequential on slower systems.
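
A simplified user-space sketch of the idea, assuming a cut-down stand-in
for struct rpc_task. pick_action(), step() and on_wake() are hypothetical
names used only for illustration; the point is that tk_callback is written
only in the rare step where it was non-NULL.

```c
#include <assert.h>
#include <stddef.h>

struct task;
typedef void (*task_fn)(struct task *);

/* Hypothetical, heavily simplified stand-in for struct rpc_task. */
struct task {
	task_fn tk_action;	/* next FSM step */
	task_fn tk_callback;	/* one-shot callback, NULL in most steps */
	int steps;		/* number of FSM steps run */
	int callbacks;		/* number of callbacks run */
};

/* Example FSM step and callback; they only count invocations. */
static void step(struct task *t)    { t->steps++; }
static void on_wake(struct task *t) { t->callbacks++; }

/*
 * Choose what to run on this iteration. The common path is a single
 * load and test of tk_callback; the store of NULL happens only when a
 * callback was actually pending.
 */
static task_fn pick_action(struct task *t)
{
	task_fn do_action;

	assert(t != NULL);
	do_action = t->tk_action;
	if (t->tk_callback != NULL) {
		do_action = t->tk_callback;
		t->tk_callback = NULL;
	}
	return do_action;
}
```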

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# cf08d6f2 03-Jan-2018 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: task_run_action should display tk_callback

This shows up in every RPC:

kworker/4:1-19772 [004] 3467.373443: rpc_task_run_action: task:4711@2 flags=0e81 state=0005 status=0 action=call_status
kworker/4:1-19772 [004] 3467.373444: rpc_task_run_action: task:4711@2 flags=0e81 state=0005 status=0 action=call_status

What's actually going on is that the first iteration of the RPC
scheduler is invoking the function in tk_callback (in this case,
xprt_timer), then invoking call_status on the next iteration.

Feeding do_action, rather than tk_action, to the "task_run_action"
trace point will now always display the correct FSM step.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# b2bfe591 03-Nov-2017 Chuck Lever <chuck.lever@oracle.com>

sunrpc: Fix rpc_task_begin trace point

The rpc_task_begin trace point always displays a task ID of zero.
Move the trace point call site so that it picks up the new task ID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# ff861c4d 16-Oct-2017 Kees Cook <keescook@chromium.org>

sunrpc: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 62b2417e 09-Apr-2017 NeilBrown <neilb@suse.com>

sunrpc: don't check for failure from mempool_alloc()

When mempool_alloc() is allowed to sleep (GFP_NOIO allows
sleeping) it cannot fail.
So rpc_alloc_task() cannot fail, so rpc_new_task doesn't need
to test for failure.
Consequently rpc_new_task() cannot fail, so the callers
don't need to test.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 68778945 15-Sep-2016 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Separate buffer pointers for RPC Call and Reply messages

For xprtrdma, the RPC Call and Reply buffers are involved in real
I/O operations.

To start with, the DMA direction of the I/O for a Call is opposite
that of a Reply.

In the current arrangement, the Reply buffer address is on a
four-byte alignment just past the call buffer. It would be friendlier
on some platforms if it were at a DMA cache alignment instead.

Because the current arrangement allocates a single memory region
which contains both buffers, the RPC Reply buffer often contains a
page boundary in it when the Call buffer is large enough (which is
frequent).

It would be a little nicer for setting up DMA operations (and
possible registration of the Reply buffer) if the two buffers were
separated, well-aligned, and contained as few page boundaries as
possible.

Now, I could just pad out the single memory region used for the pair
of buffers. But frequently that would mean a lot of unused space to
ensure the Reply buffer did not have a page boundary.

Add a separate pointer to rpc_rqst that points right to the RPC
Reply buffer. This makes no difference to xprtsock, but it will help
xprtrdma in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 3435c74a 15-Sep-2016 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Generalize the RPC buffer release API

xprtrdma needs to allocate the Call and Reply buffers separately.
TBH, the reliance on using a single buffer for the pair of XDR
buffers is transport implementation-specific.

Instead of passing just the rq_buffer into the buf_free method, pass
the task structure and let buf_free take care of freeing both
XDR buffers at once.

There's a micro-optimization here. In the common case, both
xprt_release and the transport's buf_free method were checking if
rq_buffer was NULL. Now the check is done only once per RPC.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# 5fe6eaa1 15-Sep-2016 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Generalize the RPC buffer allocation API

xprtrdma needs to allocate the Call and Reply buffers separately.
TBH, the reliance on using a single buffer for the pair of XDR
buffers is transport implementation-specific.

Transports that want to allocate separate Call and Reply buffers
will ignore the "size" argument anyway. Don't bother passing it.

The buf_alloc method can't return two pointers. Instead, make the
method's return value an error code, and set the rq_buffer pointer
in the method itself.

This gives call_allocate an opportunity to terminate an RPC instead
of looping forever when a permanent problem occurs. If a request is
just bogus, or the transport is in a state where it can't allocate
resources for any request, there needs to be a way to kill the RPC
right there and not loop.
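
The calling convention described above can be sketched in user space as
follows. Everything here is a hypothetical model rather than the transport
API itself: struct rqst, MAX_BUF, buf_alloc() and allocate_step() are
illustrative names, and treating -ENOMEM as the only retryable error while
an oversized request yields -EINVAL is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_BUF 4096	/* hypothetical limit on call + reply size */

/* Hypothetical stand-in for the request structure. */
struct rqst {
	size_t callsize;
	size_t rcvsize;
	void *rq_buffer;
};

/*
 * Return 0 and set req->rq_buffer on success, or a negative errno.
 * The buffer pointer is set by the method, not returned.
 */
static int buf_alloc(struct rqst *req)
{
	size_t size = req->callsize + req->rcvsize;

	assert(req->rq_buffer == NULL);
	if (size > MAX_BUF)
		return -EINVAL;		/* permanent: retrying cannot help */
	req->rq_buffer = malloc(size);
	if (req->rq_buffer == NULL)
		return -ENOMEM;		/* transient: worth retrying later */
	return 0;
}

enum next_step { STEP_PROCEED, STEP_RETRY, STEP_EXIT };

/* Caller side: the error code decides between retrying and giving up. */
static enum next_step allocate_step(struct rqst *req)
{
	int status = buf_alloc(req);

	if (status == 0)
		return STEP_PROCEED;
	if (status == -ENOMEM)
		return STEP_RETRY;
	return STEP_EXIT;
}
```

With a bare pointer return value the caller could not tell the two failure
cases apart, which is what made looping forever possible.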

This immediately fixes a rare problem in the backchannel send path,
which loops if the server happens to send a CB request whose
call+reply size is larger than a page (which it shouldn't do yet).

One more issue: looks like xprt_inject_disconnect was incorrectly
placed in the failure path in call_allocate. It needs to be in the
success path, as it is for other call-sites.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>


# f1dc237c 26-May-2016 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Reduce latency when send queue is congested

Use the low latency transport workqueue to process the task that is
next in line on the xprt->sending queue.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 40a5f1b1 27-May-2016 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: RPC transport queue must be low latency

rpciod can easily get congested due to the long list of queued rpc_tasks.
Having the receive queue wait in turn for those tasks to complete can
therefore be a bottleneck.

Address the problem by separating the workqueues into:
- rpciod: manages rpc_tasks
- xprtiod: manages transport related work.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 9d61498d 30-Jan-2016 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Allow caller to specify the transport to use

This is needed in order to allow the NFSv4.1 backchannel and
BIND_CONN_TO_SESSION function to work.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# dfd01f02 13-Dec-2015 Peter Zijlstra <peterz@infradead.org>

sched/wait: Fix the signal handling fix

Jan Stancek reported that I wrecked things for him by fixing things for
Vladimir :/

His report was due to an UNINTERRUPTIBLE wait getting -EINTR, which
should not be possible, however my previous patch made this possible by
unconditionally checking signal_pending().

We cannot use current->state as was done previously, because that
variable can change immediately after the store to it. We must
instead pass the initial state along and use that.

Fixes: 68985633bccb ("sched/wait: Fix signal handling in bit wait helpers")
Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Chris Mason <clm@fb.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Tested-by: Chris Mason <clm@fb.com>
Reviewed-by: Paul Turner <pjt@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: tglx@linutronix.de
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: hpa@zytor.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# ac5be6b4 22-Sep-2015 Andrea Arcangeli <aarcange@redhat.com>

userfaultfd: revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key"

This reverts commit 51360155eccb907ff8635bd10fc7de876408c2e0 and adapts
fs/userfaultfd.c to use the old version of that function.

It didn't look robust to call __wake_up_common with "nr == 1" when we
absolutely require wakeall semantics, but we have full control of what we
insert in the two waitqueue heads of the blocked userfaults. No
exclusive waitqueue entry can be inserted into those two waitqueue heads,
so we may as well stick to the "nr == 1" of the old code and rely
purely on the fact that no entry inserted in either of the two waitqueue
heads, which we must treat as wakeall, has WQ_FLAG_EXCLUSIVE set in
wait->flags.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Thierry Reding <treding@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 17a9618e 13-Sep-2015 Julia Lawall <Julia.Lawall@lip6.fr>

SUNRPC: drop null test before destroy functions

Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
\(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 51360155 04-Sep-2015 Andrea Arcangeli <aarcange@redhat.com>

userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key

userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead
of the current hardcoded 1 (that would wake just the first waitqueue in
the head list).

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: zhang.zhanghailiang@huawei.com
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Huangpeng (Peter)" <peter.huangpeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 55cc1d78 13-Mar-2015 Nicholas Mc Guire <hofrat@osadl.org>

SUNRPC: fix build-warning due to format missmatch

fix build-warning introduced by commit: f0eede10fd4 ("SUNRPC: use
jiffies_to_msecs for converting jiffies") which did not fixup
the format properly (my bad).

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# f0eede10 03-Mar-2015 Nicholas Mc Guire <hofrat@osadl.org>

SUNRPC: use jiffies_to_msecs for converting jiffies

Use jiffies_to_msecs for converting jiffies as it handles all of the corner
cases reliably and also helps readability.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# c4a7ca77 23-Jan-2015 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Allow waiting on memory allocation

We should be safe now, as long as we don't do GFP_IO or higher allocations

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 127b21b8 23-Jan-2015 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Adjust rpciod workqueue parameters

Increase the concurrency level for rpciod threads to allow for allocations
etc that happen in the RPCSEC_GSS layer. Also note that the NFSv4 byte range
locks may now need to allocate memory from inside rpciod.

Add the WQ_HIGHPRI flag to improve latency guarantees while we're at it.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 1306729b 17-Nov-2014 Jeff Layton <jlayton@kernel.org>

sunrpc: eliminate RPC_TRACEPOINTS

It's always set to the same value as CONFIG_TRACEPOINTS, so we can just
use that instead.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# f895b252 17-Nov-2014 Jeff Layton <jlayton@kernel.org>

sunrpc: eliminate RPC_DEBUG

It's always set to whatever CONFIG_SUNRPC_DEBUG is, so just use that.

Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 1aff5256 23-Sep-2014 NeilBrown <neilb@suse.de>

NFS/SUNRPC: Remove other deadlock-avoidance mechanisms in nfs_release_page()

Now that nfs_release_page() doesn't block indefinitely, other deadlock
avoidance mechanisms aren't needed.
- it doesn't hurt for kswapd to block occasionally. If it doesn't
want to block it would clear __GFP_WAIT. The current_is_kswapd()
was only added to avoid deadlocks and we have a new approach for
that.
- memory allocation in the SUNRPC layer can very rarely try to
->releasepage() a page it is trying to handle. The deadlock
is removed as nfs_release_page() doesn't block indefinitely.

So we don't need to set PF_FSTRANS for sunrpc network operations any
more.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# c1221321 06-Jul-2014 NeilBrown <neilb@suse.de>

sched: Allow wait_on_bit_action() functions to support a timeout

It is currently not possible for various wait_on_bit functions
to implement a timeout.

While the "action" function that is called to do the waiting
could certainly use schedule_timeout(), there is no way to carry
forward the remaining timeout after a false wake-up.
As false wake-ups are clearly possible, at least due to possible
hash collisions in bit_waitqueue(), this is a real problem.

The 'action' function is currently passed a pointer to the word
containing the bit being waited on. No current action functions
use this pointer. So changing it to something else will be a
little noisy but will have no immediate effect.

This patch changes the 'action' function to take a pointer to
the "struct wait_bit_key", which contains a pointer to the word
containing the bit so nothing is really lost.

It also adds a 'private' field to "struct wait_bit_key", which
is initialized to zero.

An action function can now implement a timeout with something
like

	static int timed_out_waiter(struct wait_bit_key *key)
	{
		unsigned long waited;
		if (key->private == 0) {
			/* First call: record the start time; 0 means "unset". */
			key->private = jiffies;
			if (key->private == 0)
				key->private -= 1;
		}
		waited = jiffies - key->private;
		if (waited > 10 * HZ)
			return -EAGAIN;
		/* Sleep only for what remains of the 10 second budget. */
		schedule_timeout(10 * HZ - waited);
		return 0;
	}

If any other need for context in a waiter were found it would be
easy to use ->private for some other purpose, or even extend
"struct wait_bit_key".

My particular need is to support timeouts in nfs_release_page()
to avoid deadlocks with loopback mounted NFS.

While wait_on_bit_timeout() would be a cleaner interface, it
will not meet my need. I need the timeout to be sensitive to
the state of the connection with the server, which could change.
So I need to use an 'action' interface.

Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140707051604.28027.41257.stgit@notabene.brown
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# c6c8fe79 07-May-2014 David Rientjes <rientjes@google.com>

net, sunrpc: suppress allocation warning in rpc_malloc()

rpc_malloc() allocates with GFP_NOWAIT without making any attempt at
reclaim so it easily fails when low on memory. This ends up spamming the
kernel log:

SLAB: Unable to allocate memory on node 0 (gfp=0x4000)
cache: kmalloc-8192, object size: 8192, order: 1
node 0: slabs: 207/207, objs: 207/207, free: 0
rekonq: page allocation failure: order:1, mode:0x204000
CPU: 2 PID: 14321 Comm: rekonq Tainted: G O 3.15.0-rc3-12.gfc9498b-desktop+ #6
Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 2105 07/23/2010
0000000000000000 ffff880010ff17d0 ffffffff815e693c 0000000000204000
ffff880010ff1858 ffffffff81137bd2 0000000000000000 0000001000000000
ffff88011ffebc38 0000000000000001 0000000000204000 ffff88011ffea000
Call Trace:
[<ffffffff815e693c>] dump_stack+0x4d/0x6f
[<ffffffff81137bd2>] warn_alloc_failed+0xd2/0x140
[<ffffffff8113be19>] __alloc_pages_nodemask+0x7e9/0xa30
[<ffffffff811824a8>] kmem_getpages+0x58/0x140
[<ffffffff81183de6>] fallback_alloc+0x1d6/0x210
[<ffffffff81183be3>] ____cache_alloc_node+0x123/0x150
[<ffffffff81185953>] __kmalloc+0x203/0x490
[<ffffffffa06b0ee2>] rpc_malloc+0x32/0xa0 [sunrpc]
[<ffffffffa06a6999>] call_allocate+0xb9/0x170 [sunrpc]
[<ffffffffa06b19d8>] __rpc_execute+0x88/0x460 [sunrpc]
[<ffffffffa06b2da9>] rpc_execute+0x59/0xc0 [sunrpc]
[<ffffffffa06a932b>] rpc_run_task+0x6b/0x90 [sunrpc]
[<ffffffffa077b5c1>] nfs4_call_sync_sequence+0x51/0x80 [nfsv4]
[<ffffffffa077d45d>] _nfs4_do_setattr+0x1ed/0x280 [nfsv4]
[<ffffffffa0782a72>] nfs4_do_setattr+0x72/0x180 [nfsv4]
[<ffffffffa078334c>] nfs4_proc_setattr+0xbc/0x140 [nfsv4]
[<ffffffffa074a7e8>] nfs_setattr+0xd8/0x240 [nfs]
[<ffffffff811baa71>] notify_change+0x231/0x380
[<ffffffff8119cf5c>] chmod_common+0xfc/0x120
[<ffffffff8119df80>] SyS_chmod+0x40/0x90
[<ffffffff815f4cfd>] system_call_fastpath+0x1a/0x1f
...

If the allocation fails, simply return NULL and avoid spamming the kernel
log.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 6bd14416 19-Mar-2014 Trond Myklebust <trond.myklebust@primarydata.com>

SUNRPC: Don't let rpc_delay() clobber non-timeout errors

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>


# 8d1018c7 04-Sep-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure rpc_task->tk_pid is available for tracepoints

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 9ec2ef53 22-May-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove redundant call to rpc_set_running() in __rpc_execute()

The RPC_TASK_RUNNING flag will always have been set in rpc_make_runnable()
once we get past the test for out_of_line_wait_on_bit() returning
ERESTARTSYS.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 0053a8e6 20-May-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove unused function rpc_queue_empty

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a76580fb 20-May-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a potential race in rpc_execute

If the rpc_task is asynchronous, it could theoretically finish executing
on the workqueue it was assigned by rpc_make_runnable() before we get
round to testing RPC_IS_ASYNC() in rpc_execute.

In practice, however, all the existing callers hold a reference to the
rpc_task, so this can't happen today...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a3c3cac5 21-May-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Prevent an rpc_task wakeup race

The lockless RPC_IS_QUEUED() test in __rpc_execute means that we need to
be careful about ordering the calls to rpc_test_and_set_running(task) and
rpc_clear_queued(task). If we get the order wrong, then we may end up
testing the RPC_TASK_RUNNING flag after __rpc_execute() has looped
and changed the state of the rpc_task.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org


# 416ad3c9 06-May-2013 Colin Cross <ccross@android.com>

freezer: add unsafe versions of freezable helpers for NFS

NFS calls the freezable helpers with locks held, which is unsafe
and will cause lockdep warnings when 6aa9707 "lockdep: check
that no locks held at freeze time" is reapplied (it was reverted
in dbf520a). NFS shouldn't be doing this, but it has
long-running syscalls that must hold a lock but also shouldn't
block suspend. Until NFS freeze handling is rewritten to use a
signal to exit out of the critical section, add new *_unsafe
versions of the helpers that will not run the lockdep test when
6aa9707 is reapplied, and call them from NFS.

In practice the likely result of holding the lock while freezing
is that a second task blocked on the lock will never freeze,
aborting suspend, but it is possible to manufacture a case using
the cgroup freezer, the lock, and the suspend freezer to create
a deadlock. Silencing the lockdep warning here will allow
problems to be found in other drivers that may have a more
serious deadlock risk, and prevent new problems from being added.

Signed-off-by: Colin Cross <ccross@android.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# 1166fde6 25-Mar-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked

We need to be careful when testing task->tk_waitqueue in
rpc_wake_up_task_queue_locked, because it can be changed while we
are holding the queue->lock.
By adding appropriate memory barriers, we can ensure that it is safe to
test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org


# edd2e36f 27-Jan-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: When changing the queue priority, ensure that we change the owner

This fixes a livelock in the xprt->sending queue where we end up never
making progress on lower priority tasks because sleep_on_priority()
keeps adding new tasks with the same owner to the head of the queue,
and priority bumps mean that we keep resetting the queue->owner to
whatever task is at the head of the queue.

Regression introduced by commit c05eecf636101dd4347b2d8fa457626bf0088e0a
(SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones).

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 87ed5003 07-Jan-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure we release the socket write lock if the rpc_task exits early

If the rpc_task exits while holding the socket write lock before it has
allocated an rpc slot, then the usual mechanism for releasing the write
lock in xprt_release() is defeated.

The problem occurs if the call to xprt_lock_write() initially fails, so
that the rpc_task is put on the xprt->sending wait queue. If the task
exits after being assigned the lock by __xprt_lock_write_func, but
before it has retried the call to xprt_lock_and_alloc_slot(), then
it calls xprt_release() while holding the write lock, but will
immediately exit due to the test for task->tk_rqstp != NULL.

Reported-by: Chris Perl <chris.perl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org [>= 3.1]


# c6567ed1 03-Jan-2013 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure that we free the rpc_task after cleanups are done

This patch ensures that we free the rpc_task after the cleanup callbacks
are done in order to avoid a deadlock problem that can be triggered if
the callback needs to wait for another workqueue item to complete.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Weston Andros Adamson <dros@netapp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Bruce Fields <bfields@fieldses.org>
Cc: stable@vger.kernel.org


# c05eecf6 30-Nov-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Don't allow low priority tasks to pre-empt higher priority ones

Currently, the priority queues attempt to be 'fair' to lower priority
tasks by scheduling them after a certain number of higher priority tasks
have run. The problem is that both the transport send queue and
the NFSv4.1 session slot queue have strong ordering requirements.

This patch therefore removes the fairness code in favour of strong
ordering of task priorities.
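
As a rough sketch of what strong ordering means here, assuming a model in
which only the number of waiters per level is known: the next task always
comes from the highest non-empty level, with no allowance for lower ones.
highest_waiting() is a hypothetical helper, not a function in sched.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the highest priority level with at least one waiter, or -1 if
 * every level is empty. Lower levels are served only when all higher
 * levels are empty.
 */
static int highest_waiting(const unsigned int waiting[], unsigned int nr_prio)
{
	unsigned int p = nr_prio;

	assert(waiting != NULL);
	while (p-- > 0) {
		if (waiting[p] > 0)
			return (int)p;
	}
	return -1;
}
```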

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 1e1093c7 01-Nov-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4.1: Don't mess with task priorities in nfs41_setup_sequence

We want to preserve the rpc_task priority for things like writebacks,
that may have differing levels of urgency.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 0a0c2a57 23-Oct-2012 Weston Andros Adamson <dros@netapp.com>

SUNRPC: remove BUG_ON in rpc_release_task

Replace BUG_ON() with WARN_ON_ONCE().

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 2bd4eef8 23-Oct-2012 Weston Andros Adamson <dros@netapp.com>

SUNRPC: remove BUG_ONs checking RPC_IS_QUEUED

Replace two BUG_ON() calls with WARN_ON_ONCE() and early returns.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# f50ad428 23-Oct-2012 Weston Andros Adamson <dros@netapp.com>

SUNRPC: remove BUG_ON from __rpc_sleep_on_priority

Replace BUG_ON() with WARN_ON_ONCE().

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e454a7a8 23-Oct-2012 Weston Andros Adamson <dros@netapp.com>

SUNRPC: remove BUG_ON from rpc_sleep_on*

Replace BUG_ON() with WARN_ON_ONCE() and clean up after inactive task.

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 9b96ce71 28-Sep-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Limit the rpciod workqueue concurrency

We shouldn't need more than 1 worker thread per cpu, since rpciod
is designed to run without sleeping in most cases.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a564b8f0 31-Jul-2012 Mel Gorman <mgorman@suse.de>

nfs: enable swap on NFS

Implement the new swapfile a_ops for NFS and hook up ->direct_IO. This
will set the NFS socket to SOCK_MEMALLOC and run socket reconnect under
PF_MEMALLOC as well as reset SOCK_MEMALLOC before engaging the protocol
->connect() method.

PF_MEMALLOC should allow the allocation of struct socket and related
objects and the early (re)setting of SOCK_MEMALLOC should allow us to
receive the packets required for the TCP connection buildup.

[jlayton@redhat.com: Restore PF_MEMALLOC task flags in all cases]
[dfeng@redhat.com: Fix handling of multiple swap files]
[a.p.zijlstra@chello.nl: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5cf02d09 23-Jul-2012 Jeff Layton <jlayton@kernel.org>

nfs: skip commit in releasepage if we're freeing memory for fs-related reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

PID: 2507 TASK: ffff88103691ab40 CPU: 14 COMMAND: "rpciod/14"
#0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
#1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
#2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
#3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
#4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
#5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org


# 506026c3 23-Jul-2012 Jeff Layton <jlayton@kernel.org>

sunrpc: clarify comments on rpc_make_runnable

rpc_make_runnable is not generally called with the queue lock held, unless
it's waking up a task that has been sitting on a waitqueue. This is safe
when the task has not entered the FSM yet, but the comments don't really
spell this out.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 540a0f75 19-Mar-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up()

The problem is that for the case of priority queues, we
have to assume that __rpc_remove_wait_queue_priority will move new
elements from the tk_wait.links lists into the queue->tasks[] list.
We therefore cannot use list_for_each_entry_safe() on queue->tasks[],
since that will skip these new tasks that __rpc_remove_wait_queue_priority
is adding.

Without this fix, rpc_wake_up and rpc_wake_up_status will both fail
to wake up all tasks on priority wait queues, which can result
in some nasty hangs.

Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
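
The skipped-task problem can be reproduced outside the kernel. The following
is a minimal user-space sketch, not the sunrpc code: struct node, its single
follower slot and every function name are made up for illustration, and the
promotion step only loosely models what __rpc_remove_wait_queue_priority
does. It contrasts a loop that caches the next pointer (the approach taken by
list_for_each_entry_safe()) with one that re-reads the list head after every
removal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued task. */
struct node {
    struct node *prev, *next;   /* position on the main queue */
    struct node *follower;      /* one task parked behind this one, or NULL */
    int woken;
};

static void list_init(struct node *head) { head->prev = head->next = head; }

static void list_add_after(struct node *n, struct node *pos)
{
    n->prev = pos;
    n->next = pos->next;
    pos->next->prev = n;
    pos->next = n;
}

static void list_add_tail(struct node *n, struct node *head)
{
    list_add_after(n, head->prev);
}

static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

/* Remove n from the queue; its follower, if any, takes over n's slot. */
static void remove_and_promote(struct node *n)
{
    assert(n != NULL);
    if (n->follower != NULL) {
        list_add_after(n->follower, n);
        n->follower = NULL;
    }
    list_del(n);
    n->woken = 1;
}

/* "Safe" style: next is cached before the removal, so a node promoted
 * into the gap between n and next is never visited. */
static int wake_all_cached_next(struct node *head)
{
    struct node *n, *next;
    int count = 0;

    for (n = head->next, next = n->next; n != head; n = next, next = n->next) {
        remove_and_promote(n);
        count++;
    }
    return count;
}

/* Re-read the head on every pass, so promoted nodes are seen too. */
static int wake_all_refetch(struct node *head)
{
    int count = 0;

    while (head->next != head) {
        remove_and_promote(head->next);
        count++;
    }
    return count;
}
```

With a queue holding two nodes, the first of which has a parked follower, the
cached-next loop wakes two nodes and leaves the follower queued, while the
re-fetching loop drains the queue completely.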


# 2f09c242 08-Feb-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure that we can trace waitqueues when !defined(CONFIG_SYSCTL)

The tracepoint code relies on the queue->name being defined in order to
be able to display the name of the waitqueue on which an RPC task is
sleeping.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>


# 82b0a4c3 20-Jan-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add trace events to the sunrpc subsystem

Add declarations to allow tracing of RPC call creation, running, sleeping,
and destruction.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 961a828d 17-Jan-2012 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix potential races in xprt_lock_write_next()

We have to ensure that the wake up from the waitqueue and the assignment
of xprt->snd_task are atomic. We can do this by assigning the snd_task
while under the waitqueue spinlock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# d310310c 01-Dec-2011 Jeff Layton <jlayton@kernel.org>

Freezer / sunrpc / NFS: don't allow TASK_KILLABLE sleeps to block the freezer

Allow the freezer to skip wait_on_bit_killable sleeps in the sunrpc
layer. This should allow suspend and hibernate events to proceed, even
when there are RPCs pending on the wire.

Also, wrap the TASK_KILLABLE sleeps in NFS layer in freezer_do_not_count
and freezer_count calls. This allows the freezer to skip tasks that are
sleeping while looping on EJUKEBOX or NFS4ERR_DELAY sorts of errors.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>


# 7fdcf13b 01-Dec-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix the execution time statistics in the face of RPC restarts

If the rpc_task gets restarted, then we want to ensure that we don't
double-count the execution time statistics, timeout data, etc.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 3b27bad7 17-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Allow caller of rpc_sleep_on() to select priority levels

Currently, the caller has to change the value of task->tk_priority if
it wants to select on which priority level the task will sleep.

This patch allows the caller to select a priority level at sleep time
rather than always using task->tk_priority.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
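
The idea of choosing the priority level at sleep time can be sketched in
user space. This is an illustrative model only: the PRIO_* values, struct
model_queue and the model_* functions are assumptions of the sketch and do
not reproduce the kernel's rpc_wait_queue or its wake-up policy.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative priority levels; names and values are assumptions. */
enum { PRIO_LOW = -1, PRIO_NORMAL = 0, PRIO_HIGH = 1, PRIO_COUNT = 3 };

struct model_task {
    int priority;               /* the task's default priority */
    struct model_task *next;    /* link within one priority level */
};

struct model_queue {
    struct model_task *level[PRIO_COUNT];   /* one FIFO per priority level */
};

/* Queue the task at an explicitly chosen level; t->priority is untouched. */
static void model_sleep_on_priority(struct model_queue *q,
                                    struct model_task *t, int priority)
{
    struct model_task **pp;

    assert(priority >= PRIO_LOW && priority <= PRIO_HIGH);
    pp = &q->level[priority - PRIO_LOW];
    while (*pp != NULL)
        pp = &(*pp)->next;
    t->next = NULL;
    *pp = t;
}

/* Default behaviour: sleep at the task's own priority. */
static void model_sleep_on(struct model_queue *q, struct model_task *t)
{
    model_sleep_on_priority(q, t, t->priority);
}

/* Simplified wake-up: take the oldest task from the highest non-empty level. */
static struct model_task *model_wake_next(struct model_queue *q)
{
    for (int i = PRIO_COUNT - 1; i >= 0; i--) {
        struct model_task *t = q->level[i];

        if (t != NULL) {
            q->level[i] = t->next;
            t->next = NULL;
            return t;
        }
    }
    return NULL;
}
```

A caller that wants one particular sleep to be served first passes PRIO_HIGH
to model_sleep_on_priority() and leaves the task's stored priority alone.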


# b55c5989 06-Jul-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a race between work-queue and rpc_killall_tasks

Since rpc_killall_tasks may modify the rpc_task's tk_action field
without any locking, we need to be careful when dereferencing it.

Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org


# 0b760113 31-May-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

NLM: Don't hang forever on NLM unlock requests

If the NLM daemon is killed on the NFS server, we can currently end up
hanging forever on an 'unlock' request, instead of aborting. Basically,
if the rpcbind request fails, or the server keeps returning garbage, we
really want to quit instead of retrying.

Tested-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org


# a271c5a0 27-Mar-2011 OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

NFS: Ensure that rpc_release_resources_task() can be called twice.

BUG: atomic_dec_and_test(): -1: atomic counter underflow at:
Pid: 2827, comm: mount.nfs Not tainted 2.6.38 #1
Call Trace:
[<ffffffffa02223a0>] ? put_rpccred+0x44/0x14e [sunrpc]
[<ffffffffa021bbe9>] ? rpc_ping+0x4e/0x58 [sunrpc]
[<ffffffffa021c4a5>] ? rpc_create+0x481/0x4fc [sunrpc]
[<ffffffffa022298a>] ? rpcauth_lookup_credcache+0xab/0x22d [sunrpc]
[<ffffffffa028be8c>] ? nfs_create_rpc_client+0xa6/0xeb [nfs]
[<ffffffffa028c660>] ? nfs4_set_client+0xc2/0x1f9 [nfs]
[<ffffffffa028cd3c>] ? nfs4_create_server+0xf2/0x2a6 [nfs]
[<ffffffffa0295d07>] ? nfs4_remote_mount+0x4e/0x14a [nfs]
[<ffffffff810dd570>] ? vfs_kern_mount+0x6e/0x133
[<ffffffffa029605a>] ? nfs_do_root_mount+0x76/0x95 [nfs]
[<ffffffffa029643d>] ? nfs4_try_mount+0x56/0xaf [nfs]
[<ffffffffa0297434>] ? nfs_get_sb+0x435/0x73c [nfs]
[<ffffffff810dd59b>] ? vfs_kern_mount+0x99/0x133
[<ffffffff810dd693>] ? do_kern_mount+0x48/0xd8
[<ffffffff810f5b75>] ? do_mount+0x6da/0x741
[<ffffffff810f5c5f>] ? sys_mount+0x83/0xc0
[<ffffffff8100293b>] ? system_call_fastpath+0x16/0x1b

This looks like a real bug somewhere in the NFS code. On review, the
call chain

    rpc_call_sync()
        rpc_run_task
            rpc_execute()
                __rpc_execute()
                    rpc_release_task()
                        rpc_release_resources_task()
                            put_rpccred() <= release cred
        rpc_put_task
            rpc_do_put_task()
                rpc_release_resources_task()
                    put_rpccred() <= release cred again

appears to release the cred twice unintentionally.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e020c680 15-Mar-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure we always run the tk_callback before tk_action

This fixes a race in which the task->tk_callback() puts the rpc_task
to sleep, setting a new callback. Under certain circumstances, the current
code may end up executing the task->tk_action before it gets round to the
callback.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
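
The ordering rule can be illustrated with a small user-space loop. This is a
sketch under stated assumptions, not the body of __rpc_execute(): struct
model_task, its trace buffer and the three step functions are hypothetical.
On every pass the loop takes the pending callback first and falls back to the
action only when no callback is set, so a callback that installs another
callback still runs ahead of the action.

```c
#include <assert.h>
#include <stddef.h>

struct model_task;
typedef void (*model_fn)(struct model_task *);

struct model_task {
    model_fn callback;      /* pending wake-up callback, if any */
    model_fn action;        /* next state-machine step, NULL when done */
    char trace[16];         /* records the order in which steps ran */
    size_t ntrace;
};

static void record(struct model_task *t, char c)
{
    assert(t != NULL);
    if (t->ntrace + 1 < sizeof(t->trace)) {
        t->trace[t->ntrace++] = c;
        t->trace[t->ntrace] = '\0';
    }
}

/* Always consume the callback before looking at the action. */
static void model_execute(struct model_task *t)
{
    for (;;) {
        model_fn step = t->callback;

        t->callback = NULL;
        if (step == NULL) {
            step = t->action;
            if (step == NULL)
                break;
        }
        step(t);
    }
}

static void second_callback(struct model_task *t)
{
    record(t, 'b');
}

/* A callback that sets a new callback, as described in the commit text. */
static void first_callback(struct model_task *t)
{
    record(t, 'a');
    t->callback = second_callback;
}

static void final_action(struct model_task *t)
{
    record(t, 'X');
    t->action = NULL;
}
```

Starting a task with first_callback pending and final_action as its action
records both callbacks before the action.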


# eabf5baa 11-Feb-2011 Fred Isaman <iisaman@netapp.com>

RPC: clarify rpc_run_task error handling

rpc_run_task can only fail if it is not passed in a preallocated task.
However, that is not at all clear with the current code. So
remove several impossible to occur failure checks.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# cee6a537 11-Feb-2011 Fred Isaman <iisaman@netapp.com>

RPC: remove check for impossible condition in rpc_make_runnable

queue_work() only returns 0 or 1, never a negative value.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# bf294b41 21-Feb-2011 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Close a race in __rpc_wait_for_completion_task()

Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# ada609ee 25-Jan-2011 Tejun Heo <tj@kernel.org>

workqueue: use WQ_MEM_RECLAIM instead of WQ_RESCUER

WQ_RESCUER is now an internal flag and should only be used in the
workqueue implementation proper. Use WQ_MEM_RECLAIM instead.

This doesn't introduce any functional difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: dm-devel@redhat.com
Cc: Neil Brown <neilb@suse.de>


# a02cec21 22-Sep-2010 Eric Dumazet <eric.dumazet@gmail.com>

net: return operator cleanup

Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4fbf6e50 21-Sep-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Convert rpciod to use the alloc_workqueue() interface

create_workqueue() is a deprecated function.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# d6a1ed08 31-Jul-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Reduce asynchronous RPC task stack usage

We should just farm out asynchronous RPC tasks immediately to rpciod...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a17c2153 31-Jul-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Move the bound cred to struct rpc_rqst

This will allow us to save the original generic cred in rpc_message, so
that if we migrate from one server to another, we can generate a new bound
cred without having to punt back to the NFS layer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 58f9612c 31-Jul-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Move remaining RPC client related task initialisation into clnt.c

Now that rpc_run_task() is the sole entry point for RPC calls, we can move
the remaining rpc_client-related initialisation of struct rpc_task from
sched.c into clnt.c.

Also move rpc_killall_tasks() into the same file, since that too is
relative to the rpc_clnt.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# d9b6cd94 31-Jul-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure that rpc_exit() always wakes up a sleeping task

Make rpc_exit() non-inline, and ensure that it always wakes up a task that
has been queued.

Kill off the now unused rpc_wake_up_task().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# d72b6cec 12-May-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove the 'tk_magic' debugging field

It has not triggered in almost a decade. Time to get rid of it...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# ff839970 07-May-2010 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Replace jiffies-based metrics with ktime-based metrics

Currently RPC performance metrics that tabulate elapsed time use
jiffies time values. This is problematic on systems that use slow
jiffies (for instance 100HZ systems built for paravirtualized
environments). It is also a problem for computing precise latency
statistics for advanced network transports, such as InfiniBand,
that can have round-trip latencies significantly faster than a single
clock tick.

For the RPC client, adopt the high resolution time stamp mechanism
already used by the network layer and blktrace: ktime.

We use ktime format time stamps for all internal computations, and
convert to milliseconds for presentation. As a result, we need only
addition operations in the performance critical paths; multiply/divide
is required only for presentation.

We could report RTT metrics in microseconds. In fact the mountstats
format is versioned to accommodate exactly this kind of interface
improvement.

For now, however, we'll stay with millisecond precision for
presentation to maintain backwards compatibility with the handful of
currently deployed user space tools. At a later point, we'll move to
an API such as BDI_STATS where a finer timestamp precision can be
reported.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
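
The accounting scheme described above (add in the hot path, divide only for
presentation) can be sketched in user space. The names below are assumptions
of the sketch; the kernel uses ktime_t and its own helpers rather than raw
int64_t nanoseconds. The tick-based total is kept only to show what is lost
on a 100HZ clock.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NSEC_PER_MSEC 1000000LL
#define MODEL_HZ            100     /* a "slow jiffies" configuration */
#define MODEL_NSEC_PER_TICK (1000000000LL / MODEL_HZ)

struct model_metrics {
    int64_t rtt_ns;     /* running total in nanoseconds */
    int64_t rtt_ticks;  /* the same total at tick granularity */
};

/* Hot path: subtraction and addition only. */
static void model_add_rtt(struct model_metrics *m,
                          int64_t start_ns, int64_t end_ns)
{
    assert(end_ns >= start_ns);
    m->rtt_ns += end_ns - start_ns;
    m->rtt_ticks += end_ns / MODEL_NSEC_PER_TICK
                  - start_ns / MODEL_NSEC_PER_TICK;
}

/* Presentation path: the only place a division is needed. */
static int64_t model_rtt_ms(const struct model_metrics *m)
{
    return m->rtt_ns / MODEL_NSEC_PER_MSEC;
}

/* What a tick-based counter would report for the same samples. */
static int64_t model_rtt_ms_from_ticks(const struct model_metrics *m)
{
    return m->rtt_ticks * (1000 / MODEL_HZ);
}
```

Two round trips of 1.5 ms and 0.7 ms that both fall inside a single 10 ms
tick add up to 2 ms in the nanosecond total, whereas the tick-based total
stays at zero.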


# 19445b99 16-Apr-2010 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Cleanup - make rpc_new_task() call rpc_release_calldata on failure

Also have it return an ERR_PTR(-ENOMEM) instead of a null pointer.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
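
For readers unfamiliar with the error-pointer convention, here is a
user-space sketch of the pattern. The model_* helpers imitate the kernel's
ERR_PTR(), PTR_ERR() and IS_ERR() but are reimplemented here as an
assumption of the sketch; model_new_task(), its simulate_oom parameter and
model_count_release() are hypothetical and exist only to exercise the
failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *model_err_ptr(long err)
{
    assert(err < 0 && err >= -MODEL_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

static long model_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error pointers occupy the top MODEL_MAX_ERRNO addresses. */
static int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_task {
    void *calldata;
};

typedef void (*model_release_fn)(void *calldata);

/* On allocation failure, release the caller's data and return an error
 * pointer instead of NULL, so the caller has nothing left to clean up. */
static struct model_task *model_new_task(void *calldata,
                                         model_release_fn release,
                                         int simulate_oom)
{
    struct model_task *t = simulate_oom ? NULL : malloc(sizeof(*t));

    if (t == NULL) {
        if (release != NULL)
            release(calldata);
        return model_err_ptr(-ENOMEM);
    }
    t->calldata = calldata;
    return t;
}

/* Test helper: counts how many times the release callback ran. */
static void model_count_release(void *calldata)
{
    ++*(int *)calldata;
}
```

A caller checks the result with model_is_err() and extracts the code with
model_ptr_err(); it must not call the release callback again on failure.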


# 689cf5c1 14-Dec-2009 Alexandros Batsakis <batsakis@netapp.com>

nfs: enforce FIFO ordering of operations trying to acquire slot

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 48f18612 14-Dec-2009 Alexandros Batsakis <batsakis@netapp.com>

rpc: add rpc_queue_empty function

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 6951867b 09-Sep-2009 Benny Halevy <bhalevy@panasas.com>

nfsd41: sunrpc: move struct rpc_buffer def into sunrpc.h

Move struct rpc_buffer's definition into a sunrpc.h, a common, internal
header file, in preparation for supporting the nfsv4.1 backchannel.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
[nfs41: sunrpc: #include <linux/net.h> from sunrpc.h]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>


# 405f5571 11-Jul-2009 Alexey Dobriyan <adobriyan@gmail.com>

headers: smp_lock.h redux

* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# aae2006e 01-Apr-2009 Andy Adamson <andros@netapp.com>

nfs41: sunrpc: Export the call prepare state for session reset

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# eb9b55ab 10-Mar-2009 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Tighten up the task locking rules in __rpc_execute()

We should probably not be testing any flags after we've cleared the
RPC_TASK_RUNNING flag, since rpc_make_runnable() is then free to assign the
rpc_task to another workqueue, which may then destroy it.

We can fix any races with rpc_make_runnable() by ensuring that we only
clear the RPC_TASK_RUNNING flag while holding the rpc_wait_queue->lock that
the task is supposed to be sleeping on (and then checking whether or not
the task really is sleeping).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a86dc496 11-Jun-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove the BKL from the callback functions

Push it into those callback functions that actually need it.

Note that all the NFS operations use their own locking, so don't need the
BKL. Ditto for the rpcbind client.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a486aeda 09-Jun-2008 J. Bruce Fields <bfields@citi.umich.edu>

rpc: minor cleanup of scheduler callback code

Try to make the comment here a little more clear and concise.

Also, this macro definition seems unnecessary.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 4ccda2cd 12-Mar-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Clean up rpcauth_bindcred()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# af093835 11-Mar-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix RPCAUTH_LOOKUP_ROOTCREDS

The current RPCAUTH_LOOKUP_ROOTCREDS flag only works for AUTH_SYS
authentication, and then only as a special case in the code. This patch
removes the auth_sys special casing, and replaces it with generic code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 5e4424af 25-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove now-redundant RCU-safe rpc_task free path

Now that we've tightened up the locking rules for RPC queue wakeups, we can
remove the RCU-safe kfree calls...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# f5fb7b06 25-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Eliminate the now-redundant rpc_start_wakeup()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# eb276c0e 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Switch tasks to using the rpc_waitqueue's timer function

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 36df9aae 18-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add a timer function to wait queues.

This is designed to replace the timeout timer in the individual rpc_tasks.
By putting the timer function in the wait queue, we will eventually be able
to reduce the total number of timers in use by the RPC subsystem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# f6a1cc89 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add a (empty for the moment) destructor for rpc_wait_queues

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 5d00837b 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Run rpc timeout functions as callbacks instead of in softirqs

An audit of the current RPC timeout functions shows that they don't really
ever need to run in the softirq context. As long as the softirq is
able to signal that the wakeup is due to a timeout (which it can do by
setting task->tk_status to -ETIMEDOUT) then the callback functions can just
run as standard task->tk_callback functions (in the rpciod/process
context).

The only possible border-line case would be xprt_timer() for the case of
UDP, when the callback is used to reduce the size of the transport
congestion window. In testing, however, the effect of moving that update
to a callback would appear to be minor.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# fda13939 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Convert users of rpc_wake_up_task to use rpc_wake_up_queued_task

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 96ef13b2 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add a new helper rpc_wake_up_queued_task()

In all cases where we currently use rpc_wake_up_task(), we almost always
know on which waitqueue the rpc_task is actually sleeping. This will allow
us to simplify the queue locking in a future patch.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# fde95c75 22-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Clean up rpc_run_timer()

All RPC timeout callback functions are expected to wake the task up. We can
enforce this by moving the wakeup back into rpc_run_timer.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 32bfb5c0 19-Feb-2008 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Allow the rpc_release() callback to be run on another workqueue

A lot of the work done by the rpc_release() callback is inappropriate for
rpciod as it will often involve things like starting a new rpc call in
order to clean up state after an interrupted NFSv4 open() call, or
calls to mntput(), etc.

This patch allows the caller of rpc_run_task() to specify that the
rpc_release callback should run on a different workqueue than the default
rpciod_workqueue.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a4a87499 18-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Cleanup to remove the last users of the RPC_WAITQ declaration

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 47fe0648 25-Oct-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Unexport rpc_init_task() and rpc_execute()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e8f5d77c 25-Oct-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: allow the caller of rpc_run_task to preallocate the struct rpc_task

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# b3ef8b3b 25-Oct-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Allow rpc_init_task() to initialise the rpc_task->tk_msg

In preparation for the removal of rpc_call_setup().

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 3ff7576d 14-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Clean up the initialisation of priority queue scheduling info.

We want the default scheduling priority (priority == 0) to remain
RPC_PRIORITY_NORMAL.

Also ensure that the priority wait queue scheduling is per process id
instead of sometimes being per thread, and sometimes being per inode.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 84115e1c 14-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Cleanup of rpc_task initialisation

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e8914c65 14-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Restrict sunrpc client exports

The sunrpc client exports are not meant to be part of any official kernel
API: they can change at the drop of a hat. Mark them as internal functions
using EXPORT_SYMBOL_GPL.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# a6eaf8bd 14-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Move exported declarations to the function declarations

Do this for all RPC client related functions and XDR functions.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# b24b8a24 23-Jan-2008 Pavel Emelyanov <xemul@openvz.org>

[NET]: Convert init_timer into setup_timer

A great deal of code in the kernel initialized the timer->function
and timer->data together with calling init_timer(timer). There
is already a helper for this. Use it for networking code.

The patch is HUGE, but makes the code 130 lines shorter
(98 insertions(+), 228 deletions(-)).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 150030b7 06-Dec-2007 Matthew Wilcox <willy@infradead.org>

NFS: Switch from intr mount option to TASK_KILLABLE

By using the TASK_KILLABLE infrastructure, we can get rid of the 'intr'
mount option. We have to use _killable everywhere instead of _interruptible
as we get rid of rpc_clnt_sigmask/sigunmask.

Signed-off-by: Liam R. Howlett <howlett@gmail.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>


# ba25f9dc 19-Oct-2007 Pavel Emelyanov <xemul@openvz.org>

Use helpers to obtain task pid in printks

The task_struct->pid member is going to be deprecated, so start
using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
the kernel.

The first thing to start with is the pid, printed to dmesg - in
this case we may safely use task_pid_nr(). Besides, printks account for
more (much more) than half of all the explicit pid usage.

[akpm@linux-foundation.org: git-drm went and changed lots of stuff]
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 12444809 10-Sep-2007 "Talpey, Thomas" <Thomas.Talpey@netapp.com>

SUNRPC: add EXPORT_SYMBOL_GPL for generic transport functions

As a preface to allowing arbitrary transport modules to be loaded
dynamically, add EXPORT_SYMBOL_GPL for all generic transport functions
that a transport implementation might want to use.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Acked-by: Tom Talpey <tmt@netapp.com>

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# b247bbf1 19-Jul-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a race in rpciod_down()

The commit 4ada539ed77c7a2bbcb75cafbbd7bd8d2b9bef7b led to the unpleasant
possibility of an asynchronous rpc_task being required to call
rpciod_down() when it is complete. This again means that the rpciod
workqueue may get to call destroy_workqueue on itself -> hang...

Change rpciod_up/rpciod_down to just get/put the module, and then
create/destroy the workqueues on module load/unload.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 20c2df83 19-Jul-2007 Paul Mundt <lethal@linux-sh.org>

mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt <lethal@linux-sh.org>


# 6e5b70e9 12-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: clean up rpc_call_async/rpc_call_sync/rpc_run_task

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 188fef11 16-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Move rpc_register_client and friends into net/sunrpc/clnt.c

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 4ada539e 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Make create_client() take a reference to the rpciod workqueue

Ensures that an rpc_client always has the possibility to send asynchronous
RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# ab418d70 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Optimise rpciod_up()

Instead of taking the mutex every time we just need to increment/decrement
rpciod_users, we can optimise by using atomic_inc_not_zero and
atomic_dec_and_test.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
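
The fast path can be sketched with C11 atomics. This is a user-space model
under stated assumptions: inc_not_zero() and dec_and_test() mimic the
semantics of the kernel's atomic_inc_not_zero() and atomic_dec_and_test(),
struct rpciod_model is hypothetical, and the mutex that would guard the slow
paths is only indicated by comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *v unless it is zero; report whether the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 0) {
        /* On failure, old is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Decrement *v; report whether it reached zero. */
static bool dec_and_test(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) == 1;
}

struct rpciod_model {
    atomic_int users;
    int started;        /* stands in for "the workqueue exists" */
};

static void model_init(struct rpciod_model *m)
{
    atomic_init(&m->users, 0);
    m->started = 0;
}

static void model_up(struct rpciod_model *m)
{
    if (inc_not_zero(&m->users))
        return;                 /* fast path: no mutex needed */
    /* Slow path: real code would take a mutex here and re-check. */
    if (!m->started)
        m->started = 1;
    atomic_fetch_add(&m->users, 1);
}

static void model_down(struct rpciod_model *m)
{
    assert(atomic_load(&m->users) > 0);
    if (!dec_and_test(&m->users))
        return;                 /* fast path: other users remain */
    /* Slow path: last user gone; real code would tear down under a mutex. */
    m->started = 0;
}
```

Only the first model_up() and the last model_down() reach the slow path;
every call in between touches nothing but the atomic counter.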


# 90c5755f 09-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Kill rpc_clnt->cl_oneshot

Replace it with explicit calls to rpc_shutdown_client() or
rpc_destroy_client() (for the case of asynchronous calls).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 34f52e35 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Convert rpc_clnt->cl_users to a kref

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# c44fe705 16-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Clean up tk_pid allocation and make it lockless

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 4bef61ff 16-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Add a per-rpc_clnt spinlock

Use that to protect the rpc_clnt->cl_tasks list instead of using a global
lock.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 6529eba0 14-Jun-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Move rpc_task->tk_task list into struct rpc_clnt

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 9c9cc93a 09-Feb-2007 Christoph Hellwig <hch@infradead.org>

SUNRPC: remove dead variable 'rpciod_running'

rpciod_running is not used at all, but due to the way DECLARE_MUTEX_LOCKED
works we don't get a warning for it.


Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# ddce40df 09-May-2007 Peter Zijlstra <a.p.zijlstra@chello.nl>

sunrpc: fix crash in rpc_malloc()


While the comment says:
* To prevent rpciod from hanging, this allocator never sleeps,
* returning NULL if the request cannot be serviced immediately.

The function does not actually check for NULL pointers being returned.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# aa3d1fae 08-May-2007 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Fix pointer arithmetic bug recently introduced in rpc_malloc/free

Use a cleaner method to find the size of an rpc_buffer. This actually
works on x86-64!

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
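
One portable way to recover a buffer's size from the payload pointer is to
keep the length in a header and step back by the header's offset. The sketch
below is a user-space illustration with hypothetical names (struct
model_buffer and the model_* functions); it is not claimed to be the exact
code of this patch. It also returns NULL on allocation failure so that
callers can check for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct model_buffer {
    size_t len;     /* total allocation size, header included */
    char data[];    /* payload handed back to the caller */
};

/* Step back from the payload to its header using offsetof(), which is
 * correct regardless of pointer width or structure padding. */
static struct model_buffer *model_buffer_of(void *payload)
{
    return (struct model_buffer *)
        ((char *)payload - offsetof(struct model_buffer, data));
}

static void *model_malloc(size_t size)
{
    struct model_buffer *buf;

    size += sizeof(*buf);
    buf = malloc(size);
    if (buf == NULL)
        return NULL;            /* the caller must check for this */
    buf->len = size;
    return buf->data;
}

static size_t model_buffer_len(void *payload)
{
    assert(payload != NULL);
    return model_buffer_of(payload)->len;
}

static void model_free(void *payload)
{
    if (payload == NULL)
        return;
    free(model_buffer_of(payload));
}
```

Because the header is located by offset rather than by hand-computed pointer
arithmetic, the same code works on 32-bit and 64-bit targets.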


# 215d0678 08-May-2007 Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>

Fix sunrpc warning noise

Commit c5a4dd8b7c15927a8fbff83171b57cad675a79b9 introduced the following
compiler warnings:

net/sunrpc/sched.c:766: warning: format '%u' expects type 'unsigned int', but argument 3 has type 'size_t'
net/sunrpc/sched.c:785: warning: format '%u' expects type 'unsigned int', but argument 2 has type 'size_t'

- Use %zu to format size_t
- Kill 2 useless casts

Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
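
As a small illustration of the format fix, the user-space sketch below
prints a size_t with %zu, which matches the type on both 32-bit and 64-bit
targets without a cast. The function name and message text are made up for
the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a size_t with %zu; returns the number of characters written
 * (excluding the terminating NUL), as snprintf() does. */
static int model_format_size(char *out, size_t outlen, size_t size)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen, "allocating %zu bytes", size);
}
```

Using %u here would trigger the warning quoted above wherever size_t is
wider than unsigned int.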


# c5a4dd8b 29-Mar-2007 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Eliminate side effects from rpc_malloc

Currently rpc_malloc sets req->rq_buffer internally. Make this a more
generic interface: return a pointer to the new buffer (or NULL) and
make the caller set req->rq_buffer and req->rq_bufsize. This looks much
more like kmalloc and eliminates the side effects.

To fix a potential deadlock, this patch also replaces GFP_NOFS with
GFP_NOWAIT in rpc_malloc. This prevents async RPCs from sleeping outside
the RPC's task scheduler while allocating their buffer.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# cca5172a 09-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

[NET] SUNRPC: Fix whitespace errors.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 46121cf7 30-Jan-2007 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: fix print format for tk_pid

The tk_pid field is an unsigned short. The proper print format specifier for
that type is %5u, not %4d.

Also clean up some miscellaneous print formatting nits.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 2efef837 03-Feb-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

RPC: Clean up rpc_execute...

The error values are already propagated through task->tk_status, and
none of the callers check one without checking the other, so we can
drop the return value.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# bde8f00c 24-Jan-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

[PATCH] NFS: Fix Oops in rpc_call_sync()

Fix the Oops in http://bugzilla.linux-nfs.org/show_bug.cgi?id=138
We shouldn't be calling rpc_release_task() for tasks that are not active.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e18b890b 06-Dec-2006 Christoph Lameter <clameter@sgi.com>

[PATCH] slab: remove kmem_cache_t

Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

#!/bin/sh
#
# Replace one string by another in all the kernel sources.
#

set -e

for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
	quilt add $file
	sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
	mv /tmp/$$ $file
	quilt refresh
done

The script was run like this

sh replace kmem_cache_t "struct kmem_cache"

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 6d5fcb5a 18-Oct-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Remove BKL around the RPC socket operations etc.

All internal RPC client operations should no longer depend on the BKL;
however, lockd and NFS callbacks may still require it.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# bbd5a1f9 18-Oct-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix up missing BKL in asynchronous RPC callback functions

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 8aca67f0 13-Nov-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Fix a potential race in rpc_wake_up_task()

Use RCU to ensure that we can safely call rpc_finish_wakeup after we've
called __rpc_do_wake_up_task. If not, there is a theoretical race, in which
the rpc_task finishes executing, and gets freed first.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e6b3c4db 11-Nov-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

Fix a second potential rpc_wakeup race...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# cc4dc59e 05-Nov-2006 Christophe Saout <christophe@saout.de>

Subject: Re: [PATCH] Fix SUNRPC wakeup/execute race condition

The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:

First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While rpc_make_runnable is executing, exactly
after it sets the task's `running' bit and before it clears the `queued'
bit, __rpc_execute picks up execution and clears `running'. Both
functions then fall through, each under the false assumption that
somebody else took the job.

Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.

Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.

Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Trond Myklebust <trond.myklebust@fys.uio.no>


# 65f27f38 22-Nov-2006 David Howells <dhowells@redhat.com>

WorkStruct: Pass the work_struct pointer instead of context data

Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated. This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).


Signed-Off-By: David Howells <dhowells@redhat.com>


# 1a1d92c1 27-Sep-2006 Alexey Dobriyan <adobriyan@gmail.com>

[PATCH] Really ignore kmem_cache_destroy return value

* Roughly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:

(void)kmem_cache_destroy(cache);

* Those who check it printk something; however, slab_error has already
printed the name of the failed cache.
* XFS BUGs on a failed kmem_cache_destroy, which is not a decision a
low-level filesystem driver should make. Converted to ignore.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 8014793b 31-Aug-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: rpc_delay() should not clobber the rpc_task->tk_status

Doing so prevents stuff like call_encode() from working correctly.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 39d7bbcb 22-Aug-2006 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: remove extraneous header inclusions

include/linux/sunrpc/clnt.h already includes include/linux/sunrpc/xprt.h.
We can remove xprt.h from source files that already include clnt.h.
Likewise include/linux/sunrpc/timer.h.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 5b1eacbc 22-Aug-2006 Chuck Lever <chuck.lever@oracle.com>

SUNRPC: Support for RPC child tasks no longer needed

The previous patches removed the last user of RPC child tasks, so we can
remove support for child tasks from net/sunrpc/sched.c now.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 93d2341c 26-Mar-2006 Matthew Dobson <colpatch@us.ibm.com>

[PATCH] mempool: use mempool_create_slab_pool()

Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 4a3e2f71 20-Mar-2006 Arjan van de Ven <arjan@infradead.org>

[NET] sem2mutex: net/

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7a1218a2 20-Mar-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release()

Currently this will not happen if we exit before rpc_new_task() was called.
Also fix up rpc_run_task() to do the same (for consistency).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# ef759a2e 20-Mar-2006 Chuck Lever <cel@netapp.com>

SUNRPC: introduce per-task RPC iostats

Account for various things that occur while an RPC task is executed.
Separate timers for RPC round trip and RPC execution time show how
long RPC requests wait in queue before being sent. Eventually these
will be accumulated at xprt_release time in one place where they can
be viewed from userland.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e19b63da 20-Mar-2006 Chuck Lever <cel@netapp.com>

SUNRPC: track length of RPC wait queues

RPC wait queue length will eventually be exported to userland via the RPC
iostats interface.

Test plan:
Compile kernel with CONFIG_NFS enabled.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 24c5d9d7 20-Mar-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Run rpci->queue_timeout on the rpciod workqueue instead of generic

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e6d83d55 13-Mar-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

[PATCH] SUNRPC: Fix potential deadlock in RPC code

In rpc_wake_up() and rpc_wake_up_status(), it is possible for the call to
__rpc_wake_up_task() to fail if another thread happens to be calling
rpc_wake_up_task() on the same rpc_task.

Problem noticed by Bruno Faccini.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 99acf044 01-Feb-2006 Martin Waitz <tali@admingilde.org>

[PATCH] DocBook: fix some kernel-doc comments in net/sunrpc

Fix the syntax of some kernel-doc comments

Signed-off-by: Martin Waitz <tali@admingilde.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 02107148 03-Jan-2006 Chuck Lever <cel@netapp.com>

SUNRPC: switchable buffer allocation

Add RPC client transport switch support for replacing buffer management
on a per-transport basis.

In the current IPv4 socket transport implementation, RPC buffers are
allocated as needed for each RPC message that is sent. Some transport
implementations may choose to use pre-allocated buffers for encoding,
sending, receiving, and unmarshalling RPC messages, however. For
transports capable of direct data placement, the buffers can be carved
out of a pre-registered area of memory rather than from a slab cache.

Test-plan:
Millions of fsx operations. Performance characterization with "sio" and
"iozone". Use oprofile and other tools to look for significant regression
in CPU utilization.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# e60859ac 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: rpc_execute should not return task->tk_status;

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 44c28873 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

NFSv4: stateful NFSv4 RPC call interface

The NFSv4 model requires us to complete all RPC calls that might
establish state on the server whether or not the user wants to
interrupt it. We may also need to schedule new work (including
new RPC calls) in order to cancel the new state.

The asynchronous RPC model will allow us to ensure that RPC calls
always complete, but in order to allow for "synchronous" RPC, we
want to add the ability to wait for completion.
The waits are, of course, interruptible.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 4ce70ada 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Further cleanups

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 963d8fe5 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

RPC: Clean up RPC task structure

Shrink the RPC task structure. Instead of storing separate pointers
for task->tk_exit and task->tk_release, put them in a structure.

Also pass the user data pointer as a parameter instead of passing it via
task->tk_calldata. This enables us to nest callbacks.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# abbcf28f 03-Jan-2006 Trond Myklebust <Trond.Myklebust@netapp.com>

SUNRPC: Yet more RPC cleanups

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# dd0fc66f 07-Oct-2005 Al Viro <viro@ftp.linux.org.uk>

[PATCH] gfp flags annotations - part 1

- added typedef unsigned int __nocast gfp_t;

- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# dd13a285 04-Oct-2005 Randy Dunlap <rdunlap@infradead.org>

[RPC]: fix sparse gfp nocast warnings

Fix nocast sparse warnings:
net/rxrpc/call.c:2013:25: warning: implicit cast to nocast type
net/rxrpc/connection.c:538:46: warning: implicit cast to nocast type
net/sunrpc/sched.c:730:36: warning: implicit cast to nocast type
net/sunrpc/sched.c:734:56: warning: implicit cast to nocast type

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ba89966c 26-Aug-2005 Eric Dumazet <dada1@cosmosbay.com>

[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers

This patch puts mostly read only data in the right section
(read_mostly), to help sharing of these data between CPUS without
memory ping pongs.

On one of my production machines, tcp_statistics was sitting in a
heavily modified cache line, so *every* SNMP update had to force a
reload.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 96651ab3 22-Jun-2005 Trond Myklebust <Trond.Myklebust@netapp.com>

[PATCH] RPC: Shrink struct rpc_task by switching to wait_on_bit()

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# d05fdb0c 22-Jun-2005 Trond Myklebust <Trond.Myklebust@netapp.com>

[PATCH] RPC: Fix a race with rpc_restart_call()

If the task->tk_exit() wants to restart the RPC call after delaying
then the current RPC code will clobber the timer by calling
rpc_delete_timer() immediately after re-entering the loop in
__rpc_execute().

Problem noticed by Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>


# 1da177e4 16-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org>

Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!