#
15d39883 |
|
11-Sep-2023 |
NeilBrown <neilb@suse.de> |
SUNRPC: change the back-channel queue to lwq This removes the need to store and update back-links in the list. It also remove the need for the _bh version of spin_lock(). Signed-off-by: NeilBrown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
063ab935 |
|
11-Sep-2023 |
NeilBrown <neilb@suse.de> |
SUNRPC: integrate back-channel processing with svc_recv() Using svc_recv() for (NFSv4.1) back-channel handling means we have just one mechanism for waking threads. Also change kthread_freezable_should_stop() in nfs4_callback_svc() to kthread_should_stop() as used elsewhere. kthread_freezable_should_stop() effectively adds a try_to_freeze() call, and svc_recv() already contains that at an appropriate place. Signed-off-by: NeilBrown <neilb@suse.de> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
#
3b50cc1c |
|
23-Sep-2022 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up synopsis of rpcrdma_req_create() Commit 1769e6a816df ("xprtrdma: Clean up rpcrdma_create_req()") added rpcrdma_req_create() with a GFP flags argument in case a caller might want to avoid waiting for memory. There has never been a caller that does not pass GFP_KERNEL as the third argument. That argument can therefore be eliminated. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
c0f26167 |
|
12-Jan-2022 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Remove definitions of RPCDBG_FACILITY Deprecated. dprintk is no longer used in xprtrdma. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
8d863b1f |
|
02-Aug-2021 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Eliminate rpcrdma_post_sends() Clean up. Now that there is only one registration mode, there is only one target "post_send" method: frwr_send(). rpcrdma_post_sends() no longer adds much value, especially since all of its call sites ignore the return code value except to check if it's non-zero. Just have them call frwr_send() directly instead. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
c35ca60d |
|
19-Apr-2021 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Delete rpcrdma_recv_buffer_put() Clean up: The name recv_buffer_put() is a vestige of older code, and the function is just a wrapper for the newer rpcrdma_rep_put(). In most of the existing call sites, a pointer to the owning rpcrdma_buffer is already available. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
84dff5eb |
|
04-Feb-2021 |
Chuck Lever <chuck.lever@oracle.com> |
rpcrdma: Fix comments about reverse-direction operation During the final stages of publication of RFC 8167, reviewers requested that we use the term "reverse direction" rather than "backwards direction". Update comments to reflect this preference. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Tom Talpey <tom@talpey.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d11e9346 |
|
09-Nov-2020 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up xprtrdma callback tracepoints - Replace displayed kernel memory addresses - Tie the XID and event with the peer's IP address Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
e28ce900 |
|
21-Feb-2020 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: kmalloc rpcrdma_ep separate from rpcrdma_xprt Change the rpcrdma_xprt_disconnect() function so that it no longer waits for the DISCONNECTED event. This prevents blocking if the remote is unresponsive. In rpcrdma_xprt_disconnect(), the transport's rpcrdma_ep is detached. Upon return from rpcrdma_xprt_disconnect(), the transport (r_xprt) is ready immediately for a new connection. The RDMA_CM_DEVICE_REMOVAL and RDMA_CM_DISCONNECTED events are now handled almost identically. However, because the lifetimes of rpcrdma_xprt structures and rpcrdma_ep structures are now independent, creating an rpcrdma_ep needs to take a module ref count. The ep now owns most of the hardware resources for a transport. Also, a kref is needed to ensure that rpcrdma_ep sticks around long enough for the cm_event_handler to finish. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
93aa8e0a |
|
21-Feb-2020 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Merge struct rpcrdma_ia into struct rpcrdma_ep I eventually want to allocate rpcrdma_ep separately from struct rpcrdma_xprt so that on occasion there can be more than one ep per xprt. The new struct rpcrdma_ep will contain all the fields currently in rpcrdma_ia and in rpcrdma_ep. This is all the device and CM settings for the connection, in addition to per-connection settings negotiated with the remote. Take this opportunity to rename the existing ep fields from rep_* to re_* to disambiguate these from struct rpcrdma_rep. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
97d0de88 |
|
21-Feb-2020 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up the post_send path Clean up: Simplify the synopses of functions in the post_send path by combining the struct rpcrdma_ia and struct rpcrdma_ep arguments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b78de1dc |
|
03-Jan-2020 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Allocate and map transport header buffers at connect time Currently the underlying RDMA device is chosen at transport set-up time. But it will soon be at connect time instead. The maximum size of a transport header is based on device capabilities. Thus transport header buffers have to be allocated _after_ the underlying device has been chosen (via address and route resolution); ie, in the connect worker. Thus, move the allocation of transport header buffers to the connect worker, after the point at which the underlying RDMA device has been chosen. This also means the RDMA device is available to do a DMA mapping of these buffers at connect time, instead of in the hot I/O path. Make that optimization as well. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
9edb455e |
|
17-Oct-2019 |
Trond Myklebust <trondmy@gmail.com> |
SUNRPC: The RDMA back channel mustn't disappear while requests are outstanding If there are RDMA back channel requests being processed by the server threads, then we should hold a reference to the transport to ensure it doesn't get freed from underneath us. Reported-by: Neil Brown <neilb@suse.de> Fixes: 63cae47005af ("xprtrdma: Handle incoming backward direction RPC calls") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
614f3c96 |
|
17-Oct-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Pull up sometimes On some platforms, DMA mapping part of a page is more costly than copying bytes. Restore the pull-up code and use that when we think it's going to be faster. The heuristic for now is to pull-up when the size of the RPC message body fits in the buffer underlying the head iovec. Indeed, not involving the I/O MMU can help the RPC/RDMA transport scale better for tiny I/Os across more RDMA devices. This is because interaction with the I/O MMU is eliminated, as is handling a Send completion, for each of these small I/Os. Without the explicit unmapping, the NIC no longer needs to do a costly internal TLB shoot down for buffers that are just a handful of bytes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
17d47f93 |
|
19-Aug-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Fix bc_max_slots return value For the moment the returned value just happens to be correct because the current backchannel server implementation does not vary the number of credits it offers. The spec does permit this value to change during the lifetime of a connection, however. The actual maximum is fixed for all RPC/RDMA transports, because each transport instance has to pre-allocate the resources for processing BC requests. That's the value that should be returned. Fixes: 7402a4fedc2b ("SUNRPC: Fix up backchannel slot table ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
7402a4fe |
|
16-Jul-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
SUNRPC: Fix up backchannel slot table accounting Add a per-transport maximum limit in the socket case, and add helpers to allow the NFSv4 code to discover that limit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
94087e97 |
|
24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Aggregate the inline settings in struct rpcrdma_ep Clean up. The inline settings are actually a characteristic of the endpoint, and not related to the device. They are also modified after the transport instance is created, so they do not belong in the cdata structure either. Lastly, let's use names that are more natural to RDMA than to NFS: inline_write -> inline_send and inline_read -> inline_recv. The /proc files retain their names to avoid breaking user space. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
3f9c7e76 |
|
24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Backchannel can use GFP_KERNEL allocations The Receive handler runs in process context, thus can use on-demand GFP_KERNEL allocations instead of pre-allocation. This makes the xprtrdma backchannel independent of the number of backchannel session slots provisioned by the Upper Layer protocol. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
bb93a1ae |
|
24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Allocate req's regbufs at xprt create time Allocating an rpcrdma_req's regbufs at xprt create time enables a pair of micro-optimizations: First, if these regbufs are always there, we can eliminate two conditional branches from the hot xprt_rdma_allocate path. Second, by allocating a 1KB buffer, it places a lower bound on the size of these buffers, without adding yet another conditional branch. The lower bound reduces the number of hardway re- allocations. In fact, for some workloads it completely eliminates hardway allocations. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
8cec3dba |
|
24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: rpcrdma_regbuf alignment Allocate the struct rpcrdma_regbuf separately from the I/O buffer to better guarantee the alignment of the I/O buffer and eliminate the wasted space between the rpcrdma_regbuf metadata and the buffer itself. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
1769e6a8 |
|
24-Apr-2019 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up rpcrdma_create_req() Eventually, I'd like to invoke rpcrdma_create_req() during the call_reserve step. Memory allocation there probably needs to use GFP_NOIO. Therefore a set of GFP flags needs to be passed in. As an additional clean up, just return a pointer or NULL, because the only error return code here is -ENOMEM. Lastly, clean up the function names to be consistent with the pattern: "rpcrdma" _ object-type _ action Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b9779a54 |
|
02-Jan-2019 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
SUNRPC: Ensure rq_bytes_sent is reset before request transmission When we resend a request, ensure that the 'rq_bytes_sent' is reset to zero. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
0ccc61b1 |
|
11-Feb-2019 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Add xdr_stream::rqst field Having access to the controlling rpc_rqst means a trace point in the XDR code can report: - the XID - the task ID and client ID - the p_name of RPC being processed Subsequent patches will introduce such trace points. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
ddbb347f |
|
19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Cull dprintk() call sites Clean up: Remove dprintk() call sites that report rare or impossible errors. Leave a few that display high-value low noise status information. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
92f4433e |
|
19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Simplify locking that protects the rl_allreqs list Clean up: There's little chance of contention between the use of rb_lock and rb_reqslock, so merge the two. This avoids having to take both in some (possibly future) cases. Transport tear-down is already serialized, thus there is no need for locking at all when destroying rpcrdma_reqs. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
889ee07f |
|
19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Remove request_module from backchannel Since commit ffe1f0df5862 ("rpcrdma: Merge svcrdma and xprtrdma modules into one"), the forward and backchannel components are part of the same kernel module. A separate request_module() call in the backchannel code is no longer necessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
0c0829bc |
|
19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Don't wake pending tasks until disconnect is done Transport disconnect processing does a "wake pending tasks" at various points. Suppose an RPC Reply is being processed. The RPC task that Reply goes with is waiting on the pending queue. If a disconnect wake-up happens before reply processing is done, that reply, even if it is good, is thrown away, and the RPC has to be sent again. This window apparently does not exist for socket transports because there is a lock held while a reply is being received which prevents the wake-up call until after reply processing is done. To resolve this, all RPC replies being processed on an RPC-over-RDMA transport have to complete before pending tasks are awoken due to a transport disconnect. Callers that already hold the transport write lock may invoke ->ops->close directly. Others use a generic helper that schedules a close when the write lock can be taken safely. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
6ceea368 |
|
19-Dec-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Refactor Receive accounting Clean up: Divide the work cleanly: - rpcrdma_wc_receive is responsible only for RDMA Receives - rpcrdma_reply_handler is responsible only for RPC Replies - the posted send and receive counts both belong in rpcrdma_ep Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
4aa5cffe |
|
24-Dec-2018 |
Vasily Averin <vvs@virtuozzo.com> |
sunrpc: remove unused bc_up operation from rpc_xprt_ops Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
f7d46681 |
|
01-Oct-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Don't disable BH's in backchannel server Clean up: This code was copied from xprtsock.c and backchannel_rqst.c. For rpcrdma, the backchannel server runs exclusively in process context, thus disabling bottom-halves is unnecessary. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
75891f50 |
|
03-Sep-2018 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
SUNRPC: Support for congestion control when queuing is enabled Both RDMA and UDP transports require the request to get a "congestion control" credit before they can be transmitted. Right now, this is done when the request locks the socket. We'd like it to happen when a request attempts to be transmitted for the first time. In order to support retransmission of requests that already hold such credits, we also want to ensure that they get queued first, so that we don't deadlock with requests that have yet to obtain a credit. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
edc81dcd |
|
22-Aug-2018 |
Trond Myklebust <trond.myklebust@hammerspace.com> |
SUNRPC: Refactor xprt_transmit() to remove the reply queue code Separate out the action of adding a request to the reply queue so that the backchannel code can simply skip calling it altogether. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
#
bd2abef3 |
|
07-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
svcrdma: Trace key RDMA API events This includes: * Posting on the Send and Receive queues * Send, Receive, Read, and Write completion * Connect upcalls * QP errors Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
b6e717cb |
|
07-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Prepare RPC/RDMA includes for server-side trace points Clean up: Move #include <trace/events/rpcrdma.h> into source files, similar to how it is done with trace/events/sunrpc.h. Server-side trace points will be part of the rpcrdma subsystem, just like the client-side trace points. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
#
7c8d9e7c |
|
04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Move Receive posting to Receive handler Receive completion and Reply handling are done by a BOUND workqueue, meaning they run on only one CPU. Posting receives is currently done in the send_request path, which on large systems is typically done on a different CPU than the one handling Receive completions. This results in movement of Receive-related cachelines between the sending and receiving CPUs. More importantly, it means that currently Receives are posted while the transport's write lock is held, which is unnecessary and costly. Finally, allocation of Receive buffers is performed on-demand in the Receive completion handler. This helps guarantee that they are allocated on the same NUMA node as the CPU that handles Receive completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
edb41e61 |
|
04-May-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Make rpc_rqst part of rpcrdma_req This simplifies allocation of the generic RPC slot and xprtrdma specific per-RPC resources. It also makes xprtrdma more like the socket-based transports: ->buf_alloc and ->buf_free are now responsible only for send and receive buffers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
2dd4a012 |
|
28-Feb-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Move creation of rl_rdmabuf to rpcrdma_create_req Refactor: Both rpcrdma_create_req call sites have to allocate the buffer where the transport header is built, so just move that allocation into rpcrdma_create_req. This buffer is a fixed size. There's no needed information available in call_allocate that is not also available when the transport is created. The original purpose for allocating these buffers on demand was to reduce the possibility that an allocation failure during transport creation will hork the mount operation during low memory scenarios. Some relief for this rare possibility is coming up in the next few patches. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
9ab6d89e |
|
03-Jan-2018 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Correct some documenting comments Fix kernel-doc warnings in net/sunrpc/xprtrdma/ . net/sunrpc/xprtrdma/verbs.c:1575: warning: No description found for parameter 'count' net/sunrpc/xprtrdma/verbs.c:1575: warning: Excess function parameter 'min_reqs' description in 'rpcrdma_ep_post_extra_recv' net/sunrpc/xprtrdma/backchannel.c:288: warning: No description found for parameter 'r_xprt' net/sunrpc/xprtrdma/backchannel.c:288: warning: Excess function parameter 'xprt' description in 'rpcrdma_bc_receive_call' Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
fc1eb807 |
|
20-Dec-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Add trace points in the client-side backchannel code paths Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
30b5416b |
|
14-Dec-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Don't clear RPC_BC_PA_IN_USE on pre-allocated rpc_rqst's No need for the overhead of atomically setting and clearing this bit flag for every use of a pre-allocated backchannel rpc_rqst. These are a distinct pool of rpc_rqsts that are used only for callback operations, so it is safe to simply leave the bit set. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
cf73daf5 |
|
14-Dec-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Split xprt_rdma_send_request Clean up. @rqst is set up differently for backchannel Replies. For example, rqst->rq_task and task->tk_client are both NULL. So it is easier to understand and maintain this code path if it is separated. Also, we can get rid of the confusing rl_connect_cookie hack in rpcrdma_bc_receive_call. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
6c537f2c |
|
14-Dec-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: buf_free not called for CB replies Since commit 5a6d1db45569 ("SUNRPC: Add a transport-specific private field in rpc_rqst"), the rpc_rqst's for RPC-over-RDMA backchannel operations leave rq_buffer set to NULL. xprt_release does not invoke ->op->buf_free when rq_buffer is NULL. The RPCRDMA_REQ_F_BACKCHANNEL check in xprt_rdma_free is therefore redundant because xprt_rdma_free is not invoked for backchannel requests. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d698c4a0 |
|
14-Dec-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Fix backchannel allocation of extra rpcrdma_reps The backchannel code uses rpcrdma_recv_buffer_put to add new reps to the free rep list. This also decrements rb_recv_count, which spoofs the receive overrun logic in rpcrdma_buffer_get_rep. Commit 9b06688bc3b9 ("xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)") replaced the original open-coded list_add with a call to rpcrdma_recv_buffer_put(), but then a year later, commit 05c974669ece ("xprtrdma: Fix receive buffer accounting") added rep accounting to rpcrdma_recv_buffer_put. It was an oversight to let the backchannel continue to use this function. The fix this, let's combine the "add to free list" logic with rpcrdma_create_rep. Also, do not allocate RPCRDMA_MAX_BC_REQUESTS rpcrdma_reps in rpcrdma_buffer_create and then allocate additional rpcrdma_reps in rpcrdma_bc_setup_reps. Allocating the extra reps during backchannel set-up is sufficient. Fixes: 05c974669ece ("xprtrdma: Fix receive buffer accounting") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
531cca0c |
|
20-Oct-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Add a field of bit flags to struct rpcrdma_req We have one boolean flag in rpcrdma_req today. I'd like to add more flags, so convert that boolean to a bit flag. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
857f9aca |
|
20-Oct-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Change return value of rpcrdma_prepare_send_sges() Clean up: Make rpcrdma_prepare_send_sges() return a negative errno instead of a bool. Soon callers will want distinct treatments of different types of failures. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b2441318 |
|
01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
7ec910e7 |
|
09-Aug-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Clean up rpcrdma_bc_marshal_reply() Same changes as in rpcrdma_marshal_req(). This removes C-structure style encoding from the backchannel. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
41c8f70f |
|
03-Aug-2017 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Harden backchannel call decoding Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
d2c23c00 |
|
22-May-2017 |
Markus Elfring <elfring@users.sourceforge.net> |
xprtrdma: Delete an error message for a failed memory allocation in xprt_rdma_bc_setup() Omit an extra message for a memory allocation failure in this function. This issue was detected by using the Coccinelle software. Link: http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
|
#
62aee0e3 |
|
29-Nov-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Cap size of callback buffer resources When the inline threshold size is set to large values (say, 32KB) any NFSv4.1 CB request from the server gets a reply with status NFS4ERR_RESOURCE. Looks like there are some upper layer assumptions about the maximum size of a reply (for example, in process_op). Cap the size of the NFSv4 client's reply resources at a page. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
655fec69 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Use gathered Send for large inline messages An RPC Call message that is sent inline but that has a data payload (ie, one or more items in rq_snd_buf's page list) must be "pulled up:" - call_allocate has to reserve enough RPC Call buffer space to accommodate the data payload - call_transmit has to memcopy the rq_snd_buf's page list and tail into its head iovec before it is sent As the inline threshold is increased beyond its current 1KB default, however, this means data payloads of more than a few KB are copied by the host CPU. For example, if the inline threshold is increased just to 4KB, then NFS WRITE requests up to 4KB would involve a memcpy of the NFS WRITE's payload data into the RPC Call buffer. This is an undesirable amount of participation by the host CPU. The inline threshold may be much larger than 4KB in the future, after negotiation with a peer server. Instead of copying the components of rq_snd_buf into its head iovec, construct a gather list of these components, and send them all in place. The same approach is already used in the Linux server's RPC-over-RDMA reply path. This mechanism also eliminates the need for rpcrdma_tail_pullup, which is used to manage the XDR pad and trailing inline content when a Read list is present. This requires that the pages in rq_snd_buf's page list be DMA-mapped during marshaling, and unmapped when a data-bearing RPC is completed. This is slightly less efficient for very small I/O payloads, but significantly more efficient as data payload size and inline threshold increase past a kilobyte. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
90aab602 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Move send_wr to struct rpcrdma_req Clean up: Most of the fields in each send_wr do not vary. There is no need to initialize them before each ib_post_send(). This removes a large-ish data structure from the stack. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b157380a |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Simplify rpcrdma_ep_post_recv() Clean up. Since commit fc66448549bb ("xprtrdma: Split the completion queue"), rpcrdma_ep_post_recv() no longer uses the "ep" argument. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
13650c23 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Eliminate "ia" argument in rpcrdma_{alloc, free}_regbuf Clean up. The "ia" argument is no longer used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
54cbd6b0 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Delay DMA mapping Send and Receive buffers Currently, each regbuf is allocated and DMA mapped at the same time. This is done during transport creation. When a device driver is unloaded, every DMA-mapped buffer in use by a transport has to be unmapped, and then remapped to the new device if the driver is loaded again. Remapping will have to be done _after_ the connect worker has set up the new device. But there's an ordering problem: call_allocate, which invokes xprt_rdma_allocate which calls rpcrdma_alloc_regbuf to allocate Send buffers, happens _before_ the connect worker can run to set up the new device. Instead, at transport creation, allocate each buffer, but leave it unmapped. Once the RPC carries these buffers into ->send_request, by which time a transport connection should have been established, check to see that the RPC's buffers have been DMA mapped. If not, map them there. When device driver unplug support is added, it will simply unmap all the transport's regbufs, but it doesn't have to deallocate the underlying memory. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
99ef4db3 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Replace DMA_BIDIRECTIONAL The use of DMA_BIDIRECTIONAL is discouraged by DMA-API.txt. Fortunately, xprtrdma now knows which direction I/O is going as soon as it allocates each regbuf. The RPC Call and Reply buffers are no longer the same regbuf. They can each be labeled correctly now. The RPC Reply buffer is never part of either a Send or Receive WR, but it can be part of Reply chunk, which is mapped and registered via ->ro_map . So it is not DMA mapped when it is allocated (DMA_NONE), to avoid a double- mapping. Since Receive buffers are no longer DMA_BIDIRECTIONAL and their contents are never modified by the host CPU, DMA-API-HOWTO.txt suggests that a DMA sync before posting each buffer should be unnecessary. (See my_card_interrupt_handler). Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
08cf2efd |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Use smaller buffers for RPC-over-RDMA headers Commit 949317464bc2 ("xprtrdma: Limit number of RDMA segments in RPC-over-RDMA headers") capped the number of chunks that may appear in RPC-over-RDMA headers. The maximum header size can be estimated and fixed to avoid allocating buffer space that is never used. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
5a6d1db4 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Add a transport-specific private field in rpc_rqst Currently there's a hidden and indirect mechanism for finding the rpcrdma_req that goes with an rpc_rqst. It depends on getting from the rq_buffer pointer in struct rpc_rqst to the struct rpcrdma_regbuf that controls that buffer, and then to the struct rpcrdma_req it goes with. This was done back in the day to avoid the need to add a per-rqst pointer or to alter the buf_free API when support for RPC-over-RDMA was introduced. I'm about to change the way regbuf's work to support larger inline thresholds. Now is a good time to replace this indirect mechanism with something that is more straightforward. I guess this should be considered a clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
b9c5bc03 |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
SUNRPC: Refactor rpc_xdr_buf_init() Clean up: there is some XDR initialization logic that is common to the forward channel and backchannel. Move it to an XDR header so it can be shared. rpc_rqst::rq_buffer points to a buffer containing big-endian data. Update its annotation as part of the clean up. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
eb342e9a |
|
15-Sep-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Eliminate INLINE_THRESHOLD macros Clean up: r_xprt is already available everywhere these macros are invoked, so just dereference that directly. RPCRDMA_INLINE_PAD_VALUE is no longer used, so it can simply be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
6b26cc8c |
|
02-May-2016 |
Chuck Lever <chuck.lever@oracle.com> |
sunrpc: Advertise maximum backchannel payload size RPC-over-RDMA transports have a limit on how large a backward direction (backchannel) RPC message can be. Ensure that the NFSv4.x CREATE_SESSION operation advertises this limit to servers. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
9f74660b |
|
15-Feb-2016 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: rpcrdma_bc_receive_call() should init rq_private_buf.len Some NFSv4.1 OPEN requests were hanging waiting for the NFS server to finish recalling delegations. Turns out that each NFSv4.1 CB request on RDMA gets a GARBAGE_ARGS reply from the Linux client. Commit 756b9b37cfb2e3dc added a line in bc_svc_process that overwrites the incoming rq_rcv_buf's length with the value in rq_private_buf.len. But rpcrdma_bc_receive_call() does not invoke xprt_complete_bc_request(), thus rq_private_buf.len is not initialized. svc_process_common() is invoked with a zero-length RPC message, and fails. Fixes: 756b9b37cfb2e3dc ('SUNRPC: Fix callback channel') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
c8bbe0c7 |
|
16-Dec-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Disable RPC/RDMA backchannel debugging messages Clean up. Fixes: 63cae47005af ('xprtrdma: Handle incoming backward direction') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
9b06688b |
|
16-Dec-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock) Clean up. rb_lock critical sections added in rpcrdma_ep_post_extra_recv() should have first been converted to use normal spin_lock now that the reply handler is a work queue. The backchannel set up code should use the appropriate helper instead of open-coding a rb_recv_bufs list add. Problem introduced by glib patch re-ordering on my part. Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst') Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
abfb6897 |
|
05-Nov-2015 |
Dan Carpenter <dan.carpenter@oracle.com> |
xprtrdma: checking for NULL instead of IS_ERR() The rpcrdma_create_req() function returns error pointers or success. It never returns NULL. Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffers') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
76566773 |
|
24-Oct-2015 |
Chuck Lever <chuck.lever@oracle.com> |
NFS: Enable client side NFSv4.1 backchannel to use other transports Forechannel transports get their own "bc_up" method to create an endpoint for the backchannel service. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [Anna Schumaker: Add forward declaration of struct net to xprt.h] Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
63cae470 |
|
24-Oct-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Handle incoming backward direction RPC calls Introduce a code path in the rpcrdma_reply_handler() to catch incoming backward direction RPC calls and route them to the ULP's backchannel server. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
83128a60 |
|
24-Oct-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Add support for sending backward direction RPC replies Backward direction RPC replies are sent via the client transport's send_request method, the same way forward direction RPC calls are sent. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
124fa17d |
|
24-Oct-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Pre-allocate Work Requests for backchannel Pre-allocate extra send and receive Work Requests needed to handle backchannel receives and sends. The transport doesn't know how many extra WRs to pre-allocate until the xprt_setup_backchannel() call, but that's long after the WRs are allocated during forechannel setup. So, use a fixed value for now. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
#
f531a5db |
|
24-Oct-2015 |
Chuck Lever <chuck.lever@oracle.com> |
xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffers xprtrdma's backward direction send and receive buffers are the same size as the forechannel's inline threshold, and must be pre- registered. The consumer has no control over which receive buffer the adapter chooses to catch an incoming backwards-direction call. Any receive buffer can be used for either a forward reply or a backward call. Thus both types of RPC message must all be the same size. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Tested-By: Devesh Sharma <devesh.sharma@avagotech.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|