History log of /linux-master/net/9p/client.c
Revision Date Author Comments
# be3193e5 08-Jan-2024 Dominique Martinet <asmadeus@codewreck.org>

9p: Fix read/write debug statements to report server reply

Previous conversion to iov missed these debug statements which would now
always print the requested size instead of the actual server reply.

Write also added a loop in a much older commit but we didn't report
these, while reads do report each iteration -- it's more coherent to
keep reporting all requests to server so move that at the same time.

Fixes: 7f02464739da ("9p: convert to advancing variant of iov_iter_get_pages_alloc()")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Message-ID: <20240109-9p-rw-trace-v1-1-327178114257@codewreck.org>


# ce070879 26-Oct-2023 Hangyu Hua <hbh25y@gmail.com>

9p/net: fix possible memory leak in p9_check_errors()

When p9pdu_readf() is called with "s?d" attribute, it allocates a pointer
that will store a string. But when p9pdu_readf() fails while handling "d"
then this pointer will not be freed in p9_check_errors().

Fixes: 51a87c552dfd ("9p: rework client code to use new protocol support functions")
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Message-ID: <20231027030302.11927-1-hbh25y@gmail.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 9b5c6281 25-Oct-2023 Dominique Martinet <asmadeus@codewreck.org>

9p: v9fs_listxattr: fix %s null argument warning

W=1 warns about null argument to kprintf:
In file included from fs/9p/xattr.c:12:
In function ‘v9fs_xattr_get’,
inlined from ‘v9fs_listxattr’ at fs/9p/xattr.c:142:9:
include/net/9p/9p.h:55:2: error: ‘%s’ directive argument is null
[-Werror=format-overflow=]
55 | _p9_debug(level, __func__, fmt, ##__VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use an empty string instead of :
- this is ok 9p-wise because p9pdu_vwritef serializes a null string
and an empty string the same way (one '0' word for length)
- since this degrades the print statements, add new single quotes for
xattr's name delimter (Old: "file = (null)", new: "file = ''")

Link: https://lore.kernel.org/r/20231008060138.517057-1-suhui@nfschina.com
Suggested-by: Su Hui <suhui@nfschina.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Acked-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Message-ID: <20231025103445.1248103-2-asmadeus@codewreck.org>


# cf7c33d3 03-May-2023 Dominique Martinet <asmadeus@codewreck.org>

9p: remove dead stores (variable set again without being read)

The 9p code for some reason used to initialize variables outside of the
declaration, e.g. instead of just initializing the variable like this:

int retval = 0

We would be doing this:

int retval;
retval = 0;

This is perfectly fine and the compiler will just optimize dead stores
anyway, but scan-build seems to think this is a problem and there are
many of these warnings making the output of scan-build full of such
warnings:
fs/9p/vfs_inode.c:916:2: warning: Value stored to 'retval' is never read [deadcode.DeadStores]
retval = 0;
^ ~

I have no strong opinion here, but if we want to regularly run
scan-build we should fix these just to silence the messages.

I've confirmed these all are indeed ok to remove.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>


# 46c30cb8 18-Dec-2022 Eric Van Hensbergen <ericvh@kernel.org>

9p: Add additional debug flags and open modes

Add some additional debug flags to assist with debugging
cache changes. Also add some additional open modes so we
can track cache state in fids more directly.

Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>


# 3866584a 18-Dec-2022 Eric Van Hensbergen <ericvh@kernel.org>

net/9p: fix bug in client create for .L

We are supposed to set fid->mode to reflect the flags
that were used to open the file. We were actually setting
it to the creation mode which is the default perms of the
file not the flags the file was opened with.

Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>


# 2a034722 27-Nov-2022 Eric Van Hensbergen <ericvh@kernel.org>

net/9p: Adjust maximum MSIZE to account for p9 header

Add maximum p9 header size to MSIZE to make sure we can
have page aligned data.

Signed-off-by: Eric Van Hensbergen <ericvh@kernel.org>
Reviewed-by: Dominique Martinet <asmadeus@codewreck.org>


# 1a4f69ef 05-Dec-2022 Dominique Martinet <asmadeus@codewreck.org>

9p/client: fix data race on req->status

KCSAN reported a race between writing req->status in p9_client_cb and
accessing it in p9_client_rpc's wait_event.

Accesses to req itself is protected by the data barrier (writing req
fields, write barrier, writing status // reading status, read barrier,
reading other req fields), but status accesses themselves apparently
also must be annotated properly with WRITE_ONCE/READ_ONCE when we
access it without locks.

Follows:
- error paths writing status in various threads all can notify
p9_client_rpc, so these all also need WRITE_ONCE
- there's a similar read loop in trans_virtio for zc case that also
needs READ_ONCE
- other reads in trans_fd should be protected by the trans_fd lock and
lists state machine, as corresponding writers all are within trans_fd
and should be under the same lock. If KCSAN complains on them we likely
will have something else to fix as well, so it's better to leave them
unmarked and look again if required.

Link: https://lkml.kernel.org/r/20221205124756.426350-1-asmadeus@codewreck.org
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Suggested-by: Marco Elver <elver@google.com>
Acked-by: Marco Elver <elver@google.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# a31b3cff 22-Nov-2022 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: fix response size check in p9_check_errors()

Since commit 60ece0833b6c ("net/9p: allocate appropriate reduced message
buffers") it is no longer appropriate to check server's response size
against msize. Check against the previously allocated buffer capacity
instead.

- Omit this size check entirely for zero-copy messages, as those always
allocate 4k (P9_ZC_HDR_SZ) linear buffers which are not used for actual
payload and can be much bigger than 4k.

- Replace p9_debug() by pr_err() to make sure this message is always
printed in case this error is triggered.

- Add 9p message type to error message to ease investigation.

Link: https://lkml.kernel.org/r/e0edec84b1c80119ae937ce854b4f5f6dbe2d08c.1669144861.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 8e4c2eee 22-Nov-2022 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: distinguish zero-copy requests

Add boolean `zc` member to struct p9_fcall to distinguish zero-copy
messages (not using the linear `sdata` buffer for message payload) from
regular messages (which do copy message payload to `sdata` before being
further processed).

This new member is appended to end of structure to avoid inserting huge
padding in generated layout.

Link: https://lkml.kernel.org/r/8f2a5c12a446c3b544da64e0b1550e1fb2d6f972.1669144861.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 26273ade 30-Nov-2022 Schspa Shi <schspa@gmail.com>

9p: set req refcount to zero to avoid uninitialized usage

When a new request is allocated, the refcount will be zero if it is
reused, but if the request is newly allocated from slab, it is not fully
initialized before being added to idr.

If the p9_read_work got a response before the refcount initiated. It will
use a uninitialized req, which will result in a bad request data struct.

Here is the logs from syzbot.

Corrupted memory at 0xffff88807eade00b [ 0xff 0x07 0x00 0x00 0x00 0x00
0x00 0x00 . . . . . . . . ] (in kfence-#110):
p9_fcall_fini net/9p/client.c:248 [inline]
p9_req_put net/9p/client.c:396 [inline]
p9_req_put+0x208/0x250 net/9p/client.c:390
p9_client_walk+0x247/0x540 net/9p/client.c:1165
clone_fid fs/9p/fid.h:21 [inline]
v9fs_fid_xattr_set+0xe4/0x2b0 fs/9p/xattr.c:118
v9fs_xattr_set fs/9p/xattr.c:100 [inline]
v9fs_xattr_handler_set+0x6f/0x120 fs/9p/xattr.c:159
__vfs_setxattr+0x119/0x180 fs/xattr.c:182
__vfs_setxattr_noperm+0x129/0x5f0 fs/xattr.c:216
__vfs_setxattr_locked+0x1d3/0x260 fs/xattr.c:277
vfs_setxattr+0x143/0x340 fs/xattr.c:309
setxattr+0x146/0x160 fs/xattr.c:617
path_setxattr+0x197/0x1c0 fs/xattr.c:636
__do_sys_setxattr fs/xattr.c:652 [inline]
__se_sys_setxattr fs/xattr.c:648 [inline]
__ia32_sys_setxattr+0xc0/0x160 fs/xattr.c:648
do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
__do_fast_syscall_32+0x65/0xf0 arch/x86/entry/common.c:178
do_fast_syscall_32+0x33/0x70 arch/x86/entry/common.c:203
entry_SYSENTER_compat_after_hwframe+0x70/0x82

Below is a similar scenario, the scenario in the syzbot log looks more
complicated than this one, but this patch can fix it.

T21124 p9_read_work
======================== second trans =================================
p9_client_walk
p9_client_rpc
p9_client_prepare_req
p9_tag_alloc
req = kmem_cache_alloc(p9_req_cache, GFP_NOFS);
tag = idr_alloc
<< preempted >>
req->tc.tag = tag;
/* req->[refcount/tag] == uninitialized */
m->rreq = p9_tag_lookup(m->client, m->rc.tag);
/* increments uninitalized refcount */

refcount_set(&req->refcount, 2);
/* cb drops one ref */
p9_client_cb(req)
/* reader thread drops its ref:
request is incorrectly freed */
p9_req_put(req)
/* use after free and ref underflow */
p9_req_put(req)

To fix it, we can initialize the refcount to zero before add to idr.

Link: https://lkml.kernel.org/r/20221201033310.18589-1-schspa@gmail.com
Cc: stable@vger.kernel.org # 6.0+ due to 6cda12864cb0 ("9p: Drop kref usage")
Fixes: 728356dedeff ("9p: Add refcount to p9_req_t")
Reported-by: syzbot+8f1060e2aaf8ca55220b@syzkaller.appspotmail.com
Signed-off-by: Schspa Shi <schspa@gmail.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# de4eda9d 15-Sep-2022 Al Viro <viro@zeniv.linux.org.uk>

use less confusing names for iov_iter direction initializers

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 60ece083 15-Jul-2022 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: allocate appropriate reduced message buffers

So far 'msize' was simply used for all 9p message types, which is far
too much and slowed down performance tremendously with large values
for user configurable 'msize' option.

Let's stop this waste by using the new p9_msg_buf_size() function for
allocating more appropriate, smaller buffers according to what is
actually sent over the wire.

Only exception: RDMA transport is currently excluded from this message
size optimization - for its response buffers that is - as RDMA transport
would not cope with it, due to its response buffers being pulled from a
shared pool. [1]

Link: https://lore.kernel.org/all/Ys3jjg52EIyITPua@codewreck.org/ [1]
Link: https://lkml.kernel.org/r/3f51590535dc96ed0a165b8218c57639cfa5c36c.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# e7c62197 15-Jul-2022 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: split message size argument into 't_size' and 'r_size' pair

Refactor 'max_size' argument of p9_tag_alloc() and 'req_size' argument
of p9_client_prepare_req() both into a pair of arguments 't_size' and
'r_size' respectively to allow handling the buffer size for request and
reply separately from each other.

Link: https://lkml.kernel.org/r/9431a25fe4b37fd12cecbd715c13af71f701f220.1657920926.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 7f024647 10-Jun-2022 Al Viro <viro@zeniv.linux.org.uk>

9p: convert to advancing variant of iov_iter_get_pages_alloc()

that one is somewhat clumsier than usual and needs serious testing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# aa7aeee1 10-Jul-2022 Tyler Hicks <tyhicks@linux.microsoft.com>

net/9p: Initialize the iounit field during fid creation

Ensure that the fid's iounit field is set to zero when a new fid is
created. Certain 9P operations, such as OPEN and CREATE, allow the
server to reply with an iounit size which the client code assigns to the
p9_fid struct shortly after the fid is created by p9_fid_create(). On
the other hand, an XATTRWALK operation doesn't allow for the server to
specify an iounit value. The iounit field of the newly allocated p9_fid
struct remained uninitialized in that case. Depending on allocation
patterns, the iounit value could have been something reasonable that was
carried over from previously freed fids or, in the worst case, could
have been arbitrary values from non-fid related usages of the memory
location.

The bug was detected in the Windows Subsystem for Linux 2 (WSL2) kernel
after the uninitialized iounit field resulted in the typical sequence of
two getxattr(2) syscalls, one to get the size of an xattr and another
after allocating a sufficiently sized buffer to fit the xattr value, to
hit an unexpected ERANGE error in the second call to getxattr(2). An
uninitialized iounit field would sometimes force rsize to be smaller
than the xattr value size in p9_client_read_once() and the 9P server in
WSL refused to chunk up the READ on the attr_fid and, instead, returned
ERANGE to the client. The virtfs server in QEMU seems happy to chunk up
the READ and this problem goes undetected there.

Link: https://lkml.kernel.org/r/20220710141402.803295-1-tyhicks@linux.microsoft.com
Fixes: ebf46264a004 ("fs/9p: Add support user. xattr")
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 67dd8e44 12-Jul-2022 Dominique Martinet <asmadeus@codewreck.org>

9p: roll p9_tag_remove into p9_req_put

mempool prep commit removed the awkward kref usage which didn't
allow passing client pointer easily with the ref, so we no longer
need a separate function to remove the tag from idr.

This has the side benefit that it should be more robust in detecting
leaks: umount will now properly catch unfreed requests as they still
will be in the idr until the last ref is dropped

Link: https://lkml.kernel.org/r/20220712060801.2487140-1-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>


# 8b11ff09 03-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

9p: Add client parameter to p9_req_put()

This is to aid in adding mempools, in the next patch.

Link: https://lkml.kernel.org/r/20220704014243.153050-2-kent.overstreet@gmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 6cda1286 03-Jul-2022 Kent Overstreet <kent.overstreet@gmail.com>

9p: Drop kref usage

An upcoming patch is going to require passing the client through
p9_req_put() -> p9_req_free(), but that's awkward with the kref
indirection - so this patch switches to using refcount_t directly.

Link: https://lkml.kernel.org/r/20220704014243.153050-1-kent.overstreet@gmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 286c171b 12-Jun-2022 Dominique Martinet <asmadeus@codewreck.org>

9p fid refcount: add a 9p_fid_ref tracepoint

This adds a tracepoint event for 9p fid lifecycle tracing: when a fid
is created, its reference count increased/decreased, and freed.
The new 9p_fid_ref tracepoint should help anyone wishing to debug any
fid problem such as missing clunk (destroy) or use-after-free.

Link: https://lkml.kernel.org/r/20220612085330.1451496-6-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# b48dbb99 11-Jun-2022 Dominique Martinet <asmadeus@codewreck.org>

9p fid refcount: add p9_fid_get/put wrappers

I was recently reminded that it is not clear that p9_client_clunk()
was actually just decrementing refcount and clunking only when that
reaches zero: make it clear through a set of helpers.

This will also allow instrumenting refcounting better for debugging
next patch

Link: https://lkml.kernel.org/r/20220612085330.1451496-5-asmadeus@codewreck.org
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# f615625a 09-Jun-2022 Al Viro <viro@zeniv.linux.org.uk>

9p: handling Rerror without copy_from_iter_full()

p9_client_zc_rpc()/p9_check_zc_errors() are playing fast
and loose with copy_from_iter_full().

Reading from file is done by sending Tread request. Response
consists of fixed-sized header (including the amount of data actually
read) followed by the data itself.

For zero-copy case we arrange the things so that the first
11 bytes of reply go into the fixed-sized buffer, with the rest going
straight into the pages we want to read into.

What makes the things inconvenient is that sglist describing
what should go where has to be set *before* the reply arrives. As
the result, if reply is an error, the things get interesting. On success
we get
size[4] Rread tag[2] count[4] data[count]
For error layout varies depending upon the protocol variant -
in original 9P and 9P2000 it's
size[4] Rerror tag[2] len[2] error[len]
in 9P2000.U
size[4] Rerror tag[2] len[2] error[len] errno[4]
in 9P2000.L
size[4] Rlerror tag[2] errno[4]

The last case is nice and simple - we have an 11-byte response
that fits into the fixed-sized buffer we hoped to get an Rread into.
In other two, though, we get a variable-length string spill into the
pages we'd prepared for the data to be read.

Had that been in fixed-sized buffer (which is actually 4K),
we would've dealt with that the same way we handle non-zerocopy case.
However, for zerocopy it doesn't end up there, so we need to copy it
from those pages.

The trouble is, by the time we get around to that, the
references to pages in question are already dropped. As the result,
p9_zc_check_errors() tries to get the data using copy_from_iter_full().
Unfortunately, the iov_iter it's trying to read from might *NOT* be
capable of that. It is, after all, a data destination, not data source.
In particular, if it's an ITER_PIPE one, copy_from_iter_full() will
simply fail.

In ->zc_request() itself we do have those pages and dealing with
the problem in there would be a simple matter of memcpy_from_page()
into the fixed-sized buffer. Moreover, it isn't hard to recognize
the (rare) case when such copying is needed. That way we get rid of
p9_zc_check_errors() entirely - p9_check_errors() can be used instead
both for zero-copy and non-zero-copy cases.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 15e2721b 30-Dec-2021 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: show error message if user 'msize' cannot be satisfied

If user supplied a large value with the 'msize' option, then
client would silently limit that 'msize' value to the maximum
value supported by transport. That's a bit confusing for users
of not having any indication why the preferred 'msize' value
could not be satisfied.

Link: https://lkml.kernel.org/r/783ba37c1566dd715b9a67d437efa3b77e3cd1a7.1640870037.git.linux_oss@crudebyte.com
Reported-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 6e195b0f 02-Nov-2021 Dominique Martinet <asmadeus@codewreck.org>

9p: fix a bunch of checkpatch warnings

Sohaib Mohamed started a serie of tiny and incomplete checkpatch fixes but
seemingly stopped halfway -- take over and do most of it.
This is still missing net/9p/trans* and net/9p/protocol.c for a later
time...

Link: http://lkml.kernel.org/r/20211102134608.1588018-3-dominique.martinet@atmark-techno.com
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 024b7d6a 02-Nov-2021 Dominique Martinet <asmadeus@codewreck.org>

9p: fix file headers

- add missing SPDX-License-Identifier
- remove (sometimes incorrect) file name from file header

Link: http://lkml.kernel.org/r/20211102134608.1588018-2-dominique.martinet@atmark-techno.com
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 27eb4c31 02-Nov-2021 Dominique Martinet <asmadeus@codewreck.org>

9p/net: fix missing error check in p9_check_errors

Link: https://lkml.kernel.org/r/99338965-d36c-886e-cd0e-1d8fff2b4746@gmail.com
Reported-by: syzbot+06472778c97ed94af66d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 9c4d94dc 04-Sep-2021 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: increase default msize to 128k

Let's raise the default msize value to 128k.

The 'msize' option defines the maximum message size allowed for any
message being transmitted (in both directions) between 9p server and 9p
client during a 9p session.

Currently the default 'msize' is just 8k, which is way too conservative.
Such a small 'msize' value has quite a negative performance impact,
because individual 9p messages have to be split up far too often into
numerous smaller messages to fit into this message size limitation.

A default value of just 8k also has a much higher probablity of hitting
short-read issues like: https://gitlab.com/qemu-project/qemu/-/issues/409

Unfortunately user feedback showed that many 9p users are not aware that
this option even exists, nor the negative impact it might have if it is
too low.

Link: http://lkml.kernel.org/r/61ea0f0faaaaf26dd3c762eabe4420306ced21b9.1630770829.git.linux_oss@crudebyte.com
Link: https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg01003.html
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 9210fc0a 04-Sep-2021 Christian Schoenebeck <linux_oss@crudebyte.com>

net/9p: use macro to define default msize

Use a macro to define the default value for the 'msize' option
at one place instead of using two separate integer literals.

Link: http://lkml.kernel.org/r/28bb651ae0349a7d57e8ddc92c1bd5e62924a912.1630770829.git.linux_oss@crudebyte.com
Signed-off-by: Christian Schoenebeck <linux_oss@crudebyte.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 03ff7371 27-Mar-2021 Xiongfeng Wang <wangxiongfeng2@huawei.com>

net: 9p: Correct function names in the kerneldoc comments

Fix the following W=1 kernel build warning(s):

net/9p/client.c:133: warning: expecting prototype for parse_options(). Prototype was for parse_opts() instead
net/9p/client.c:269: warning: expecting prototype for p9_req_alloc(). Prototype was for p9_tag_alloc() instead

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d65614a0 02-Mar-2021 Jisheng Zhang <Jisheng.Zhang@synaptics.com>

net: 9p: advance iov on empty read

I met below warning when cating a small size(about 80bytes) txt file
on 9pfs(msize=2097152 is passed to 9p mount option), the reason is we
miss iov_iter_advance() if the read count is 0 for zerocopy case, so
we didn't truncate the pipe, then iov_iter_pipe() thinks the pipe is
full. Fix it by removing the exception for 0 to ensure to call
iov_iter_advance() even on empty read for zerocopy case.

[ 8.279568] WARNING: CPU: 0 PID: 39 at lib/iov_iter.c:1203 iov_iter_pipe+0x31/0x40
[ 8.280028] Modules linked in:
[ 8.280561] CPU: 0 PID: 39 Comm: cat Not tainted 5.11.0+ #6
[ 8.281260] RIP: 0010:iov_iter_pipe+0x31/0x40
[ 8.281974] Code: 2b 42 54 39 42 5c 76 22 c7 07 20 00 00 00 48 89 57 18 8b 42 50 48 c7 47 08 b
[ 8.283169] RSP: 0018:ffff888000cbbd80 EFLAGS: 00000246
[ 8.283512] RAX: 0000000000000010 RBX: ffff888000117d00 RCX: 0000000000000000
[ 8.283876] RDX: ffff88800031d600 RSI: 0000000000000000 RDI: ffff888000cbbd90
[ 8.284244] RBP: ffff888000cbbe38 R08: 0000000000000000 R09: ffff8880008d2058
[ 8.284605] R10: 0000000000000002 R11: ffff888000375510 R12: 0000000000000050
[ 8.284964] R13: ffff888000cbbe80 R14: 0000000000000050 R15: ffff88800031d600
[ 8.285439] FS: 00007f24fd8af600(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 8.285844] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.286150] CR2: 00007f24fd7d7b90 CR3: 0000000000c97000 CR4: 00000000000406b0
[ 8.286710] Call Trace:
[ 8.288279] generic_file_splice_read+0x31/0x1a0
[ 8.289273] ? do_splice_to+0x2f/0x90
[ 8.289511] splice_direct_to_actor+0xcc/0x220
[ 8.289788] ? pipe_to_sendpage+0xa0/0xa0
[ 8.290052] do_splice_direct+0x8b/0xd0
[ 8.290314] do_sendfile+0x1ad/0x470
[ 8.290576] do_syscall_64+0x2d/0x40
[ 8.290818] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 8.291409] RIP: 0033:0x7f24fd7dca0a
[ 8.292511] Code: c3 0f 1f 80 00 00 00 00 4c 89 d2 4c 89 c6 e9 bd fd ff ff 0f 1f 44 00 00 31 8
[ 8.293360] RSP: 002b:00007ffc20932818 EFLAGS: 00000206 ORIG_RAX: 0000000000000028
[ 8.293800] RAX: ffffffffffffffda RBX: 0000000001000000 RCX: 00007f24fd7dca0a
[ 8.294153] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 0000000000000001
[ 8.294504] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
[ 8.294867] R10: 0000000001000000 R11: 0000000000000206 R12: 0000000000000003
[ 8.295217] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
[ 8.295782] ---[ end trace 63317af81b3ca24b ]---

Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ff5e72eb 03-Nov-2020 Dominique Martinet <asmadeus@codewreck.org>

9p: apply review requests for fid refcounting

Fix style issues in parent commit ("apply review requests for fid
refcounting"), no functional change.

Link: http://lkml.kernel.org/r/1605802012-31133-2-git-send-email-asmadeus@codewreck.org
Fixes: 6636b6dcc3db ("9p: add refcount to p9_fid struct")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 6636b6dc 23-Sep-2020 Jianyong Wu <jianyong.wu@arm.com>

9p: add refcount to p9_fid struct

Fix race issue in fid contention.

Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
bug in 9p. But there is race issue in fid parallel accesses.
As Greg's patch stores all of fids from opened files into according inode,
so all the lookup fid ops can retrieve fid from inode preferentially. But
there is no mechanism to handle the fid contention issue. For example,
there are two threads get the same fid in the same time and one of them
clunk the fid before the other thread ready to discard the fid. In this
scenario, it will lead to some fatal problems, even kernel core dump.

I introduce a mechanism to fix this race issue. A counter field introduced
into p9_fid struct to store the reference counter to the fid. When a fid
is allocated from the inode or dentry, the counter will increase, and
will decrease at the end of its occupation. It is guaranteed that the
fid won't be clunked before the reference counter go down to 0, then
we can avoid the clunked fid to be used.

tests:
race issue test from the old test case:
for file in {01..50}; do touch f.${file}; done
seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null

open-unlink-f*syscall test:
I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat.

Link: http://lkml.kernel.org/r/20200923141146.90046-5-jianyong.wu@arm.com
Fixes: 478ba09edc1f ("fs/9p: search open fids first")
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 154372e6 23-Sep-2020 Eric Van Hensbergen <ericvh@gmail.com>

fs/9p: fix create-unlink-getattr idiom

Fixes several outstanding bug reports of not being able to getattr from an
open file after an unlink. This patch cleans up transient fids on an unlink
and will search open fids on a client if it detects a dentry that appears to
have been unlinked. This search is necessary because fstat does not pass fd
information through the VFS API to the filesystem, only the dentry which for
9p has an imperfect match to fids.

Inherent in this patch is also a fix for the qid handling on create/open
which apparently wasn't being set correctly and was necessary for the search
to succeed.

A possible optimization over this fix is to include accounting of open fids
with the inode in the private data (in a similar fashion to the way we track
transient fids with dentries). This would allow a much quicker search for
a matching open fid.

(changed v9fs_fid_find_global to v9fs_fid_find_inode in comment)

Link: http://lkml.kernel.org/r/20200923141146.90046-2-jianyong.wu@arm.com
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>


# 760b3d61 31-Oct-2020 Andrew Lunn <andrew@lunn.ch>

net: 9p: Fix kerneldoc warnings of missing parameters etc

net/9p/client.c:420: warning: Function parameter or member 'c' not described in 'p9_client_cb'
net/9p/client.c:420: warning: Function parameter or member 'req' not described in 'p9_client_cb'
net/9p/client.c:420: warning: Function parameter or member 'status' not described in 'p9_client_cb'
net/9p/client.c:568: warning: Function parameter or member 'uidata' not described in 'p9_check_zc_errors'
net/9p/trans_common.c:23: warning: Function parameter or member 'nr_pages' not described in 'p9_release_pages'
net/9p/trans_common.c:23: warning: Function parameter or member 'pages' not described in 'p9_release_pages'
net/9p/trans_fd.c:132: warning: Function parameter or member 'rreq' not described in 'p9_conn'
net/9p/trans_fd.c:132: warning: Function parameter or member 'wreq' not described in 'p9_conn'
net/9p/trans_fd.c:56: warning: Function parameter or member 'privport' not described in 'p9_fd_opts'
net/9p/trans_rdma.c:113: warning: Function parameter or member 'cqe' not described in 'p9_rdma_context'
net/9p/trans_rdma.c:129: warning: Function parameter or member 'privport' not described in 'p9_rdma_opts'
net/9p/trans_virtio.c:215: warning: Function parameter or member 'limit' not described in 'pack_sg_list_p'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'chan_list' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'p9_max_pages' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'ring_bufs_avail' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'tag' not described in 'virtio_chan'
net/9p/trans_virtio.c:83: warning: Function parameter or member 'vc_wq' not described in 'virtio_chan'

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Dominique Martinet <asmadeus@codewreck.org>
Link: https://lore.kernel.org/r/20201031182655.1082065-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 15e522a7 12-Jul-2020 Andrew Lunn <andrew@lunn.ch>

net: 9p: kerneldoc fixes

Simple fixes which require no deep knowledge of the code.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 388f6966 05-Feb-2020 Sergey Alirzaev <l29ah@cock.li>

9pnet: allow making incomplete read requests

A user doesn't necessarily want to wait for all the requested data to
be available, since the waiting time for each request is unbounded.

The new method permits sending one read request at a time and getting
the response ASAP, allowing to use 9pnet with synthetic file systems
representing arbitrary data streams.

Link: http://lkml.kernel.org/r/20200205204053.12751-1-l29ah@cock.li
Signed-off-by: Sergey Alirzaev <l29ah@cock.li>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 0ce772fe 13-Jun-2019 Lu Shuaibing <shuaibinglu@126.com>

9p: Transport error uninitialized

The p9_tag_alloc() does not initialize the transport error t_err field.
The struct p9_req_t *req is allocated and stored in a struct p9_client
variable. The field t_err is never initialized before p9_conn_cancel()
checks its value.

KUMSAN(KernelUninitializedMemorySantizer, a new error detection tool)
reports this bug.

==================================================================
BUG: KUMSAN: use of uninitialized memory in p9_conn_cancel+0x2d9/0x3b0
Read of size 4 at addr ffff88805f9b600c by task kworker/1:2/1216

CPU: 1 PID: 1216 Comm: kworker/1:2 Not tainted 5.2.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Workqueue: events p9_write_work
Call Trace:
dump_stack+0x75/0xae
__kumsan_report+0x17c/0x3e6
kumsan_report+0xe/0x20
p9_conn_cancel+0x2d9/0x3b0
p9_write_work+0x183/0x4a0
process_one_work+0x4d1/0x8c0
worker_thread+0x6e/0x780
kthread+0x1ca/0x1f0
ret_from_fork+0x35/0x40

Allocated by task 1979:
save_stack+0x19/0x80
__kumsan_kmalloc.constprop.3+0xbc/0x120
kmem_cache_alloc+0xa7/0x170
p9_client_prepare_req.part.9+0x3b/0x380
p9_client_rpc+0x15e/0x880
p9_client_create+0x3d0/0xac0
v9fs_session_init+0x192/0xc80
v9fs_mount+0x67/0x470
legacy_get_tree+0x70/0xd0
vfs_get_tree+0x4a/0x1c0
do_mount+0xba9/0xf90
ksys_mount+0xa8/0x120
__x64_sys_mount+0x62/0x70
do_syscall_64+0x6d/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 0:
(stack is not available)

The buggy address belongs to the object at ffff88805f9b6008
which belongs to the cache p9_req_t of size 144
The buggy address is located 4 bytes inside of
144-byte region [ffff88805f9b6008, ffff88805f9b6098)
The buggy address belongs to the page:
page:ffffea00017e6d80 refcount:1 mapcount:0 mapping:ffff888068b63740 index:0xffff88805f9b7d90 compound_mapcount: 0
flags: 0x100000000010200(slab|head)
raw: 0100000000010200 ffff888068b66450 ffff888068b66450 ffff888068b63740
raw: ffff88805f9b7d90 0000000000100001 00000001ffffffff 0000000000000000
page dumped because: kumsan: bad access detected
==================================================================

Link: http://lkml.kernel.org/r/20190613070854.10434-1-shuaibinglu@126.com
Signed-off-by: Lu Shuaibing <shuaibinglu@126.com>
[dominique.martinet@cea.fr: grouped the added init with the others]
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 1f327613 28-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 188

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to free software
foundation 51 franklin street fifth floor boston ma 02111 1301 usa

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 27 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190528170026.981318839@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# bb06c388 13-Mar-2019 zhengbin <zhengbin13@huawei.com>

9p/net: fix memory leak in p9_client_create

If msize is less than 4096, we should close and put trans, destroy
tagpool, not just free client. This patch fixes that.

Link: http://lkml.kernel.org/m/1552464097-142659-1-git-send-email-zhengbin13@huawei.com
Cc: stable@vger.kernel.org
Fixes: 574d356b7a02 ("9p/net: put a lower bound on msize")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 574d356b 05-Nov-2018 Dominique Martinet <dominique.martinet@cea.fr>

9p/net: put a lower bound on msize

If the requested msize is too small (either from command line argument
or from the server version reply), we won't get any work done.
If it's *really* too small, nothing will work, and this got caught by
syzbot recently (on a new kmem_cache_create_usercopy() call)

Just set a minimum msize to 4k in both code paths, until someone
complains they have a use-case for a smaller msize.

We need to check in both mount option and server reply individually
because the msize for the first version request would be unchecked
with just a global check on clnt->msize.

Link: http://lkml.kernel.org/r/1541407968-31350-1-git-send-email-asmadeus@codewreck.org
Reported-by: syzbot+0c1d61e4db7db94102ca@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: stable@vger.kernel.org


# aa563d7b 19-Oct-2018 David Howells <dhowells@redhat.com>

iov_iter: Separate type from direction and use accessor functions

In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements. This makes it easier to add further
iterator types. Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.

Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself. Only the direction is required.

Signed-off-by: David Howells <dhowells@redhat.com>


# 72ea0321 26-Sep-2018 Dan Carpenter <dan.carpenter@oracle.com>

9p: potential NULL dereference

p9_tag_alloc() is supposed to return error pointers, but we accidentally
return a NULL here. It would cause a NULL dereference in the caller.

Link: http://lkml.kernel.org/m/20180926103934.GA14535@mwanda
Fixes: 996d5b4db4b1 ("9p: Use a slab for allocating requests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 728356de 14-Aug-2018 Tomas Bortoli <tomasbortoli@gmail.com>

9p: Add refcount to p9_req_t

To avoid use-after-free(s), use a refcount to keep track of the
usable references to any instantiated struct p9_req_t.

This commit adds p9_req_put(), p9_req_get() and p9_req_try_get() as
wrappers to kref_put(), kref_get() and kref_get_unless_zero().
These are used by the client and the transports to keep track of
valid requests' references.

p9_free_req() is added back and used as callback by kref_put().

Add SLAB_TYPESAFE_BY_RCU as it ensures that the memory freed by
kmem_cache_free() will not be reused for another type until the rcu
synchronisation period is over, so an address gotten under rcu read
lock is safe to inc_ref() without corrupting random memory while
the lock is held.

Link: http://lkml.kernel.org/r/1535626341-20693-1-git-send-email-asmadeus@codewreck.org
Co-developed-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+467050c1ce275af2a5b8@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 43cbcbee 11-Aug-2018 Tomas Bortoli <tomasbortoli@gmail.com>

9p: rename p9_free_req() function

In sight of the next patch to add a refcount in p9_req_t, rename
the p9_free_req() function in p9_release_req().

In the next patch the actual kfree will be moved to another function.

Link: http://lkml.kernel.org/r/20180811144254.23665-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 91a76be3 30-Jul-2018 Dominique Martinet <dominique.martinet@cea.fr>

9p: add a per-client fcall kmem_cache

Having a specific cache for the fcall allocations helps speed up
end-to-end latency.

The caches will automatically be merged if there are multiple caches
of items with the same size so we do not need to try to share a cache
between different clients of the same size.

Since the msize is negotiated with the server, only allocate the cache
after that negotiation has happened - previous allocations or
allocations of different sizes (e.g. zero-copy fcall) are made with
kmalloc directly.

Some figures on two beefy VMs with Connect-IB (sriov) / trans=rdma,
with ior running 32 processes in parallel doing small 32 bytes IOs:
- no alloc (4.18-rc7 request cache): 65.4k req/s
- non-power of two alloc, no patch: 61.6k req/s
- power of two alloc, no patch: 62.2k req/s
- non-power of two alloc, with patch: 64.7k req/s
- power of two alloc, with patch: 65.1k req/s

Link: http://lkml.kernel.org/r/1532943263-24378-2-git-send-email-asmadeus@codewreck.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Greg Kurz <groug@kaod.org>


# 523adb6c 29-Jul-2018 Dominique Martinet <dominique.martinet@cea.fr>

9p: embed fcall in req to round down buffer allocs

'msize' is often a power of two, or at least page-aligned, so avoiding
an overhead of two dozen bytes for each allocation will help the
allocator do its work and reduce memory fragmentation.

Link: http://lkml.kernel.org/r/1533825236-22896-1-git-send-email-asmadeus@codewreck.org
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>


# 996d5b4d 11-Jul-2018 Matthew Wilcox <willy@infradead.org>

9p: Use a slab for allocating requests

Replace the custom batch allocation with a slab. Use an IDR to store
pointers to the active requests instead of an array. We don't try to
handle P9_NOTAG specially; the IDR will happily shrink all the way back
once the TVERSION call has completed.

Link: http://lkml.kernel.org/r/20180711210225.19730-6-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# c69f297d 24-Jul-2018 Stephen Hemminger <stephen@networkplumber.org>

9p: fix whitespace issues

Remove trailing whitespace and blank lines at EOF

Link: http://lkml.kernel.org/m/20180724192918.31165-11-sthemmin@microsoft.com
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# f984579a 23-Jul-2018 Tomas Bortoli <tomasbortoli@gmail.com>

9p: validate PDU length

This commit adds length check for the PDU size.
The size contained in the header has to match the actual size,
except for TCP (trans_fd.c) where actual length is not known ahead
and the header's length will be checked only against the validity
range.

Link: http://lkml.kernel.org/r/20180723154404.2406-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+65c6b72f284a39d416b4@syzkaller.appspotmail.com
To: Eric Van Hensbergen <ericvh@gmail.com>
To: Ron Minnich <rminnich@sandia.gov>
To: Latchesar Ionkov <lucho@ionkov.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 2557d0c5 11-Jul-2018 Matthew Wilcox <willy@infradead.org>

9p: Embed wait_queue_head into p9_req_t

On a 64-bit system, the wait_queue_head_t is 24 bytes while the pointer
to it is 8 bytes. Growing the p9_req_t by 16 bytes is better than
performing a 24-byte memory allocation.

Link: http://lkml.kernel.org/r/20180711210225.19730-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# f28cdf04 11-Jul-2018 Matthew Wilcox <willy@infradead.org>

9p: Replace the fidlist with an IDR

The p9_idpool being used to allocate the IDs uses an IDR to allocate
the IDs ... which we then keep in a doubly-linked list, rather than in
the IDR which allocated them. We can use an IDR directly which saves
two pointers per p9_fid, and a tiny memory allocation per p9_client.

Link: http://lkml.kernel.org/r/20180711210225.19730-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# b5303be2 11-Jul-2018 Matthew Wilcox <willy@infradead.org>

9p: Change p9_fid_create calling convention

Return NULL instead of ERR_PTR when we can't allocate a FID. The ENOSPC
return value was getting all the way back to userspace, and that's
confusing for a userspace program which isn't expecting read() to tell it
there's no space left on the filesystem. The best error we can return to
indicate a temporary failure caused by lack of client resources is ENOMEM.

Maybe it would be better to sleep until a FID is available, but that's
not a change I'm comfortable making.

Link: http://lkml.kernel.org/r/20180711210225.19730-3-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Yiwen Jiang <jiangyiwen@huwei.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 2d58f63f 11-Jul-2018 Matthew Wilcox <willy@infradead.org>

9p: Fix comment on smp_wmb

The previous comment misled me into thinking the barrier wasn't needed
at all.

Link: http://lkml.kernel.org/r/20180711210225.19730-2-willy@infradead.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 7913690d 09-Jul-2018 Tomas Bortoli <tomasbortoli@gmail.com>

net/9p/client.c: version pointer uninitialized

The p9_client_version() does not initialize the version pointer. If the
call to p9pdu_readf() returns an error and version has not been allocated
in p9pdu_readf(), then the program will jump to the "error" label and will
try to free the version pointer. If version is not initialized, free()
will be called with uninitialized, garbage data and will provoke a crash.

Link: http://lkml.kernel.org/r/20180709222943.19503-1-tomasbortoli@gmail.com
Signed-off-by: Tomas Bortoli <tomasbortoli@gmail.com>
Reported-by: syzbot+65c6b72f284a39d416b4@syzkaller.appspotmail.com
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# 64ad31f3 10-Jul-2018 piaojun <piaojun@huawei.com>

net/9p/client.c: add missing '\n' at the end of p9_debug()

In p9_client_getattr_dotl(), we should add '\n' at the end of printing
log.

Link: http://lkml.kernel.org/r/5B44589A.50302@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>


# c290fba8 13-Jul-2018 piaojun <piaojun@huawei.com>

net/9p/client.c: put refcount of trans_mod in error case in parse_opts()

In my testing, the second mount will fail after umounting successfully.
The reason is that we put refcount of trans_mod in the correct case
rather than the error case in parse_opts() at last. That will cause the
refcount decrease to -1, and when we try to get trans_mod again in
try_module_get(), we could only increase refcount to 0 which will cause
failure as follows:

parse_opts
v9fs_get_trans_by_name
try_module_get : return NULL to caller which cause error

So we should put refcount of trans_mod in error case.

Link: http://lkml.kernel.org/r/5B3F39A0.2030509@huawei.com
Fixes: 9421c3e64137ec ("net/9p/client.c: fix potential refcnt problem of trans module")
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Tested-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8d856c72 07-Jun-2018 Chengguang Xu <cgxu519@gmx.com>

net/9p: detect invalid options as much as possible

Currently when detecting invalid options in option parsing, some
options(e.g. msize) just set errno and allow to continuously validate
other options so that it can detect invalid options as much as possible
and give proper error messages together.

This patch applies same rule to option 'trans' and 'version' when
detecting -EINVAL.

Link: http://lkml.kernel.org/r/1525340676-34072-1-git-send-email-cgxu519@gmx.com
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9421c3e6 05-Apr-2018 Chengguang Xu <cgxu519@gmx.com>

net/9p/client.c: fix potential refcnt problem of trans module

When specifying trans_mod multiple times in a mount, it will cause an
inaccurate refcount of the trans module. Also, in the error case of
option parsing, we should put the trans module if we have already got
it.

Link: http://lkml.kernel.org/r/1522154942-57339-1-git-send-email-cgxu519@gmx.com
Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a8522243 05-Apr-2018 Greg Kurz <groug@kaod.org>

net/9p: avoid -ERESTARTSYS leak to userspace

If it was interrupted by a signal, the 9p client may need to send some
more requests to the server for cleanup before returning to userspace.

To avoid such a last minute request to be interrupted right away, the
client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
the request and calls recalc_sigpending() before returning.

Unfortunately, if the transmission of this cleanup request fails for any
reason, the transport returns an error and the client propagates it
right away, without calling recalc_sigpending().

This ends up with -ERESTARTSYS from the initially interrupted request
crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
request. The specific signal handling code, which is responsible for
converting -ERESTARTSYS to -EINTR is not called, and userspace receives
the confusing errno value:

open: Unknown error 512 (512)

This is really hard to hit in real life. I discovered the issue while
working on hot-unplug of a virtio-9p-pci device with an instrumented
QEMU allowing to control request completion.

Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
error path actually. Their code flow is a bit obscure and the best
thing to do would probably be a full rewrite: to really ensure this
situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
never happen.

But given the general lack of interest for the 9p code, I won't risk
breaking more things. So this patch simply fixes the buggy paths in
both functions with a trivial label+goto.

Thanks to Laurent Dufour for his help and suggestions on how to find the
root cause and how to fix it.

Link: http://lkml.kernel.org/r/152062809886.10599.7361006774123053312.stgit@bahia.lan
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: David Miller <davem@davemloft.net>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 61b272c3 19-Nov-2017 Tuomas Tynkkynen <tuomas@tuxera.com>

9p: Fix missing commas in mount options

Since commit c4fac9100456 ("9p: Implement show_options"), the mount
options of 9p filesystems are printed out with some missing commas
between the individual options:

p9-scratch on /mnt/scratch type 9p (rw,dirsync,loose,access=clienttrans=virtio)

Add them back.

Cc: stable@vger.kernel.org # 4.13+
Fixes: c4fac9100456 ("9p: Implement show_options")
Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 9523feac 06-Sep-2017 Tuomas Tynkkynen <tuomas@tuxera.com>

net/9p: Switch to wait_event_killable()

Because userspace gets Very Unhappy when calls like stat() and execve()
return -EINTR on 9p filesystem mounts. For instance, when bash is
looking in PATH for things to execute and some SIGCHLD interrupts
stat(), bash can throw a spurious 'command not found' since it doesn't
retry the stat().

In practice, hitting the problem is rare and needs a really
slow/bogged down 9p server.

Cc: stable@vger.kernel.org
Signed-off-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# c4fac910 05-Jul-2017 David Howells <dhowells@redhat.com>

9p: Implement show_options

Implement the show_options superblock op for 9p as part of a bid to get
rid of s_options and generic_show_options() to make it easier to implement
a context-based mount where the mount options can be passed individually
over a file descriptor.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Ron Minnich <rminnich@sandia.gov>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: v9fs-developer@lists.sourceforge.net
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 1c512a7c 17-Feb-2017 Al Viro <viro@zeniv.linux.org.uk>

net/9p: switch to copy_from_iter_full()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 71d6ad08 14-Apr-2017 Al Viro <viro@zeniv.linux.org.uk>

p9_client_readdir() fix

Don't assume that server is sane and won't return more data than
asked for.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 3f07c014 08-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h>

We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 7880b43b 12-Jan-2017 Al Viro <viro@zeniv.linux.org.uk>

9p: constify ->d_name handling

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 287980e4 27-May-2016 Arnd Bergmann <arnd@arndb.de>

remove lots of IS_ERR_VALUE abuses

Most users of IS_ERR_VALUE() in the kernel are wrong, as they
pass an 'int' into a function that takes an 'unsigned long'
argument. This happens to work because the type is sign-extended
on 64-bit architectures before it gets converted into an
unsigned type.

However, anything that passes an 'unsigned short' or 'unsigned int'
argument into IS_ERR_VALUE() is guaranteed to be broken, as are
8-bit integers and types that are wider than 'unsigned long'.

Andrzej Hajda has already fixed a lot of the worst abusers that
were causing actual bugs, but it would be nice to prevent any
users that are not passing 'unsigned long' arguments.

This patch changes all users of IS_ERR_VALUE() that I could find
on 32-bit ARM randconfig builds and x86 allmodconfig. For the
moment, this doesn't change the definition of IS_ERR_VALUE()
because there are probably still architecture specific users
elsewhere.

Almost all the warnings I got are for files that are better off
using 'if (err)' or 'if (err < 0)'.
The only legitimate user I could find that we get a warning for
is the (32-bit only) freescale fman driver, so I did not remove
the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
For 9pfs, I just worked around one user whose calling conventions
are so obscure that I did not dare change the behavior.

I was using this definition for testing:

#define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))

which ends up making all 16-bit or wider types work correctly with
the most plausible interpretation of what IS_ERR_VALUE() was supposed
to return according to its users, but also causes a compile-time
warning for any users that do not pass an 'unsigned long' argument.

I suggested this approach earlier this year, but back then we ended
up deciding to just fix the users that are obviously broken. After
the initial warning that caused me to get involved in the discussion
(fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
asked me to send the whole thing again.

[ Updated the 9p parts as per Al Viro - Linus ]

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andrzej Hajda <a.hajda@samsung.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.org/lkml/2016/1/7/363
Link: https://lkml.org/lkml/2016/5/27/486
Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 999b8b88 15-Aug-2015 Vincent Bernat <vincent@bernat.im>

9p: ensure err is initialized to 0 in p9_client_read/write

Some use of those functions were providing unitialized values to those
functions. Notably, when reading 0 bytes from an empty file on a 9P
filesystem, the return code of read() was not 0.

Tested with this simple program:

#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, const char **argv)
{
assert(argc == 2);
char buffer[256];
int fd = open(argv[1], O_RDONLY|O_NOCTTY);
assert(fd >= 0);
assert(read(fd, buffer, 0) == 0);
return 0;
}

Cc: stable@vger.kernel.org # v4.1
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 0f1db7de 04-Jul-2015 Al Viro <viro@zeniv.linux.org.uk>

9p: cope with bogus responses from server in p9_client_{read,write}

if server claims to have written/read more than we'd told it to,
warn and cap the claimed byte count to avoid advancing more than
we are ready to.


# 67e808fb 04-Jul-2015 Al Viro <viro@zeniv.linux.org.uk>

p9_client_write(): avoid double p9_free_req()

Braino in "9p: switch p9_client_write() to passing it struct iov_iter *";
if response is impossible to parse and we discard the request, get the
out of the loop right there.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# a84b69cb 04-Jul-2015 Al Viro <viro@zeniv.linux.org.uk>

9p: forgetting to cancel request on interrupted zero-copy RPC

If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.

Cc: stable@vger.kernel.org # v3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 21c9f5cc 02-Apr-2015 Al Viro <viro@zeniv.linux.org.uk>

p9_client_attach(): set fid->uid correctly

it's almost always equal to current_fsuid(), but there's an exception -
if the first writeback fid is opened by non-root *and* that happens before
root has done any lookups in /, we end up doing attach for root. The
current code leaves the resulting FID owned by root from the server POV
and by non-root from the client one. Unfortunately, it means that e.g.
massive dcache eviction will leave that user buggered - they'll end
up redoing walks from / *and* picking that FID every time. As soon as
they try to create something, the things will get nasty.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# e1200fe6 01-Apr-2015 Al Viro <viro@zeniv.linux.org.uk>

9p: switch p9_client_read() to passing struct iov_iter *

... and make it loop

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 070b3656 01-Apr-2015 Al Viro <viro@zeniv.linux.org.uk>

9p: switch p9_client_write() to passing it struct iov_iter *

... and make it loop until it's done

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 4f3b35c1 01-Apr-2015 Al Viro <viro@zeniv.linux.org.uk>

net/9p: switch the guts of p9_client_{read,write}() to iov_iter

... and have get_user_pages_fast() mapping fewer pages than requested
to generate a short read/write.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# d8282ea0 14-Jul-2014 Fabian Frederick <fabf@skynet.be>

9P: remove unnecessary break after return

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0bfd6845 10-Mar-2014 Simon Derr <simon.derr@bull.net>

9P: Get rid of REQ_STATUS_FLSH

This request state is mostly useless, and properly implementing it
for RDMA would require an extra lock to be taken in handle_recv()
and in rdma_cancel() to avoid this race:

handle_recv() rdma_cancel()
. .
. if req->state == SENT
req->state = RCVD .
. req->state = FLSH

So just get rid of it.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# afd8d654 10-Mar-2014 Simon Derr <simon.derr@bull.net>

9P: Add cancelled() to the transport functions.

And move transport-specific code out of net/9p/client.c

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 05a782d4 09-Feb-2014 Rashika <rashika.kheria@gmail.com>

net: Mark function as static in 9p/client.c

Mark function as static in net/9p/client.c because it is not used
outside this file.

This eliminates the following warning in net/9p/client.c:
net/9p/client.c:207:18: warning: no previous prototype for ‘p9_fcall_alloc’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 2b6e72ed 17-Jan-2014 Dominique Martinet <dominique.martinet@cea.fr>

9P: Add memory barriers to protect request fields over cb/rpc threads handoff

We need barriers to guarantee this pattern works as intended:
[w] req->rc, 1 [r] req->status, 1
wmb rmb
[w] req->status, 1 [r] req->rc

Where the wmb ensures that rc gets written before status,
and the rmb ensures that if you observe status == 1, rc is the new value.

Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# fd944bde 09-Feb-2014 Rashika Kheria <rashika.kheria@gmail.com>

net: Mark function as static in 9p/client.c

Mark function as static in net/9p/client.c because it is not used
outside this file.

This eliminates the following warning in net/9p/client.c:
net/9p/client.c:207:18: warning: no previous prototype for ‘p9_fcall_alloc’ [-Wmissing-prototypes]

Signed-off-by: Rashika Kheria <rashika.kheria@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f94741fd 12-Nov-2013 Eric Van Hensbergen <ericvh@gmail.com>

net/9p: remove virtio default hack and set appropriate bits instead

A few releases back a patch made virtio the default transport, however
it was done in a way which side-stepped the mechanism put in place to
allow for this selection. This patch cleans that up while maintaining
virtio as the default transport.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 50192abe 21-Aug-2013 Will Deacon <will@kernel.org>

fs/9p: avoid accessing utsname after namespace has been torn down

During trinity fuzzing in a kvmtool guest, I stumbled across the
following:

Unable to handle kernel NULL pointer dereference at virtual address 00000004
PC is at v9fs_file_do_lock+0xc8/0x1a0
LR is at v9fs_file_do_lock+0x48/0x1a0
[<c01e2ed0>] (v9fs_file_do_lock+0xc8/0x1a0) from [<c0119154>] (locks_remove_flock+0x8c/0x124)
[<c0119154>] (locks_remove_flock+0x8c/0x124) from [<c00d9bf0>] (__fput+0x58/0x1e4)
[<c00d9bf0>] (__fput+0x58/0x1e4) from [<c0044340>] (task_work_run+0xac/0xe8)
[<c0044340>] (task_work_run+0xac/0xe8) from [<c002e36c>] (do_exit+0x6bc/0x8d8)
[<c002e36c>] (do_exit+0x6bc/0x8d8) from [<c002e674>] (do_group_exit+0x3c/0xb0)
[<c002e674>] (do_group_exit+0x3c/0xb0) from [<c002e6f8>] (__wake_up_parent+0x0/0x18)

I believe this is due to an attempt to access utsname()->nodename, after
exit_task_namespaces() has been called, leaving current->nsproxy->uts_ns
as NULL and causing the above dereference.

A similar issue was fixed for lockd in 9a1b6bf818e7 ("LOCKD: Don't call
utsname()->nodename from nlmclnt_setlockargs"), so this patch attempts
something similar for 9pfs.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 60ff779c 25-Jul-2013 Andi Shyti <andi@etezian.org>

9p: client: remove unused code and any reference to "cancelled" function

This patch reverts commit

80b45261a0b263536b043c5ccfc4ba4fc27c2acc

which was implementing a 'cancelled' functionality to notify that
a cancelled request will not be replied.

This implementation was not used anywhere and therefore removed.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 80b45261 21-Jun-2013 Simon Derr <simon.derr@bull.net>

9P: Add cancelled() to the transport functions.

RDMA needs to post a buffer for each incoming reply.
Hence it needs to keep count of these and needs to be
aware of whether a flushed request has received a reply
or not.

This patch adds the cancelled() callback to the transport modules.
It is called when RFLUSH has been received and that the corresponding
request will never receive a reply.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 1cff3306 21-Jun-2013 Simon Derr <simon.derr@bull.net>

9P/RDMA: count posted buffers without a pending request

In rdma_request():

If an error occurs between posting the recv and the send,
there will be a reply context posted without a pending
request.
Since there is no way to "un-post" it, we remember it and
skip post_recv() for the next request.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5387320d 21-Jun-2013 Simon Derr <simon.derr@bull.net>

9pnet: refactor struct p9_fcall alloc code

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# ea071aa1 21-Jun-2013 Simon Derr <simon.derr@bull.net>

9P: Fix fcall allocation for rdma

The current code assumes that when a request in the request array
does have a tc, it also has a rc.

This is normally true, but not always : when using RDMA, req->rc
will temporarily be set to NULL after the request has been sent.
That is usually OK though, as when the reply arrives, req->rc will be
reassigned to a sane value before the request is recycled.

But there is a catch : if the request is flushed, the reply will never
arrive, and req->rc will be NULL, but not req->tc.

This patch fixes p9_tag_alloc to take this into account.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 6390460a 20-May-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Handle error in zero copy request correctly for 9p2000.u

For zero copy request, error will be encoded in the user space buffer.
So copy the error code correctly using copy_from_user. Here we use the
extra bytes we allocate for zero copy request. If total error details
are more than P9_ZC_HDR_SZ - 7 bytes, we return -EFAULT. The patch also
avoid a memory allocation in the error path.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 42fe6484 20-May-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Handle error in zero copy request correctly for 9p2000.u

For zero copy request, error will be encoded in the user space buffer.
So copy the error code correctly using copy_from_user. Here we use the
extra bytes we allocate for zero copy request. If total error details
are more than P9_ZC_HDR_SZ - 7 bytes, we return -EFAULT. The patch also
avoid a memory allocation in the error path.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 535bcd3c 19-May-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Use virtio transpart as the default transport

Make the default 9p experience better by defaulting to virtio transport if present.
These days most of the users are using 9p in a virtualized setup

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 095e7999 19-May-2013 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Make 9P2000.L the default protocol for 9p file system

If we dont' specify a protocol version default to 9P2000.L. 9P2000.L
have better support for posix semantic and is where all the recent development
is happening.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 447c5094 29-Jan-2013 Eric W. Biederman <ebiederm@xmission.com>

9p: Modify the stat structures to use kuid_t and kgid_t

9p has thre strucrtures that can encode inode stat information. Modify
all of those structures to contain kuid_t and kgid_t values. Modify
he wire encoders and decoders of those structures to use 'u' and 'g' instead of
'd' in the format string where uids and gids are present.

This results in all kuid and kgid conversion to and from on the wire values
being performed by the same code in protocol.c where the client is known
at the time of the conversion.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


# f791f7c5 29-Jan-2013 Eric W. Biederman <ebiederm@xmission.com>

9p: Transmit kuid and kgid values

Modify the p9_client_rpc format specifiers of every function that
directly transmits a uid or a gid from 'd' to 'u' or 'g' as
appropriate.

Modify those same functions to take kuid_t and kgid_t parameters
instead of uid_t and gid_t parameters.

Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>


# 43def35c 10-Aug-2012 Simon Derr <simon.derr@bull.net>

net/9p: Check errno validity

While working on a modified server I had the Linux clients crash
a few times. This lead me to find this:

Some error codes are directly extracted from the server replies.
A malformed server reply could contain an invalid error code, with a
very large value. If this value is then passed to ERR_PTR() it will
not be properly detected as an error code by IS_ERR() and as a result
the kernel will dereference an invalid pointer.

This patch tries to avoid this.

Signed-off-by: Simon Derr <simon.derr@bull.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# f07d9010 04-Jun-2012 Joe Perches <joe@perches.com>

net/9p: Add __force to cast of __user pointer

A recent commit that removed unnecessary casts of pointers
to the same type uncovered a missing __force cast.

Add it.

Reported by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e3192690 03-Jun-2012 Joe Perches <joe@perches.com>

net: Remove casts to same type

Adding casts of objects to the same type is unnecessary
and confusing for a human reader.

For example, this cast:

int y;
int *p = (int *)&y;

I used the coccinelle script below to find and remove these
unnecessary casts. I manually removed the conversions this
script produces of casts with __force and __user.

@@
type T;
T *p;
@@

- (T *)p
+ p

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 95c96174 14-Apr-2012 Eric Dumazet <eric.dumazet@gmail.com>

net: cleanup unsigned to unsigned int

Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 208f3c28 26-Feb-2012 Jim Garlick <garlick@llnl.gov>

net/9p: handle flushed Tclunk/Tremove

When a Tclunk or Tremove request is flushed, the fid is not freed on the
server.

p9_client_clunk() should retry once on interrupt, then if interrupted
again, leak the fid for the duration of the connection.

p9_client_remove() should call p9_client_clunk() on interrupt
instead of unconditionally destroying the fid.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# a314f274 01-Feb-2012 Jim Garlick <garlick@llnl.gov>

net/9p: don't allow Tflush to be interrupted

When a signal is received while sending a Tflush, the client,
which has recursed into p9_client_rpc() while sending another request,
should wait for Rflush as long as the transport is still up.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5d385153 28-Nov-2011 Joe Perches <joe@perches.com>

9p: Reduce object size with CONFIG_NET_9P_DEBUG

Reduce object size by deduplicating formats.

Use vsprintf extension %pV.
Rename P9_DPRINTK uses to p9_debug, align arguments.
Add function for _p9_debug and macro to add __func__.
Add missing "\n"s to p9_debug uses.
Remove embedded function names as p9_debug adds it.
Remove P9_EPRINTK macro and convert use to pr_<level>.
Add and use pr_fmt and pr_<level>.

$ size fs/9p/built-in.o*
text data bss dec hex filename
62133 984 16000 79117 1350d fs/9p/built-in.o.new
67342 984 16928 85254 14d06 fs/9p/built-in.o.old
$ size net/9p/built-in.o*
text data bss dec hex filename
88792 4148 22024 114964 1c114 net/9p/built-in.o.new
94072 4148 23232 121452 1da6c net/9p/built-in.o.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 348b5901 06-Aug-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Convert net/9p protocol dumps to tracepoints

This helps in more control over debugging.
root@qemu-img-64:~# ls /pass/123
ls: cannot access /pass/123: No such file or directory
root@qemu-img-64:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
ls-1536 [001] 70.928584: 9p_protocol_dump: clnt 18446612132784021504 P9_TWALK(tag = 1)
000: 16 00 00 00 6e 01 00 01 00 00 00 02 00 00 00 01
010: 00 03 00 31 32 33 00 00 00 ff ff ff ff 00 00 00

ls-1536 [001] 70.928587: <stack trace>
=> trace_9p_protocol_dump
=> p9pdu_finalize
=> p9_client_rpc
=> p9_client_walk
=> v9fs_vfs_lookup
=> d_alloc_and_lookup
=> walk_component
=> path_lookupat
ls-1536 [000] 70.929696: 9p_protocol_dump: clnt 18446612132784021504 P9_RLERROR(tag = 1)
000: 0b 00 00 00 07 01 00 02 00 00 00 4e 03 00 02 00
010: 00 00 00 00 03 00 02 00 00 00 00 00 ff 43 00 00

ls-1536 [000] 70.929697: <stack trace>
=> trace_9p_protocol_dump
=> p9_client_rpc
=> p9_client_walk
=> v9fs_vfs_lookup
=> d_alloc_and_lookup
=> walk_component
=> path_lookupat
=> do_path_lookup

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# ef6b0807 26-Aug-2011 Dan Carpenter <error27@gmail.com>

fs/9p: change an int to unsigned int

Without this msize=4294967295 will result in a crash

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 4d5077f1 29-Aug-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: Cleanup option parsing in 9p

Instead of saying all integer argument option should be listed in the beginning
move integer parsing to each option type.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5635fd0c 26-Aug-2011 Dan Carpenter <error27@gmail.com>

9p: move dereference after NULL check

We dereferenced "req->tc" and "req->rc" before checking for NULL.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# abfa034e 15-Aug-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: Update zero-copy implementation in 9p

* remove lot of update to different data structure
* add a seperate callback for zero copy request.
* above makes non zero copy code path simpler
* remove conditionalizing TREAD/TREADDIR/TWRITE in the zero copy path
* Fix the dotu p9_check_errors with zero copy. Add sufficient doc around
* Add support for both in and output buffers in zero copy callback
* pin and unpin pages in the same context
* use helpers instead of defining page offset and rest of page ourself
* Fix mem leak in p9_check_errors
* Remove 'E' and 'F' in p9pdu_vwritef

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# c9ffb05c 29-Jun-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

net/9p: Fix the msize calculation.

msize represents the maximum PDU size that includes P9_IOHDRSZ.

Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 48e370ff 28-Jun-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: add 9P2000.L unlinkat operation

unlinkat - Remove a directory entry

size[4] Tunlinkat tag[2] dirfid[4] name[s] flag[4]
size[4] Runlinkat tag[2]

older Tremove have the below request format

size[4] Tremove tag[2] fid[4]

The remove message is used to remove a directory entry either file or directory
The remove opreation is actually a directory opertation and should ideally have
dirfid, if not we cannot represent the fid on server with anything other than
name. We will have to derive the directory name from fid in the Tremove request.

NOTE: The operation doesn't clunk the unlink fid.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 9e8fb38e 28-Jun-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: add 9P2000.L renameat operation

renameat - change name of file or directory

size[4] Trenameat tag[2] olddirfid[4] oldname[s] newdirfid[4] newname[s]
size[4] Rrenameat tag[2]

older Trename have the below request format

size[4] Trename tag[2] fid[4] newdirfid[4] name[s]

The rename message is used to change the name of a file, possibly moving it
to a new directory. The rename opreation is actually a directory opertation
and should ideally have olddirfid, if not we cannot represent the fid on server
with anything other than name. We will have to derive the old directory name
from fid in the Trename request.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 4d63055f 06-May-2011 Prem Karat <prem.karat@linux.vnet.ibm.com>

fs/9p: Clean-up get_protocol_version() to use strcmp

Signed-off-by: Prem Karat <prem.karat@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5034990e 11-Jul-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: Fid is not valid after a failed clunk.

free the fid even in case of failed clunk.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# e660a828 19-Jun-2011 Eric Van Hensbergen <ericvh@gmail.com>

9p: clean up packet dump code

Switch to generic kernel hexdump library and cleanup macros to
be more consistent with the way we do normal debug prints.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# b85f7d92 13-Jul-2011 Eric Van Hensbergen <ericvh@gmail.com>

net/9p: fix client code to fail more gracefully on protocol error

There was a BUG_ON to protect against a bad id which could be dealt with
more gracefully.

Reported-by: Natalie Orlin <norlin@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# fe1cbaba 20-May-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: p9_idpool_get return -1 on error

We need to return -1 on error. Also handle error properly

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# aca00763 08-May-2011 Rob Landley <rob@landley.net>

9p: typo fixes and minor cleanups

Typo fixes and minor cleanups for v9fs

Signed-off-by: Rob Landley <rob@landley.net>
Reviewed-by: Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 3cd79678 15-Apr-2011 M. Mohan Kumar <mohan@in.ibm.com>

net/9p: Handle get_user_pages_fast return properly

Use proper data type to handle get_user_pages_fast error condition. Also
do not treat EFAULT error as fatal.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# b76225e2 31-Mar-2011 Harsh Prateek Bora <harsh@linux.vnet.ibm.com>

net/9p: nwname should be an unsigned int

Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric VAn Hensbergen <ericvh@gmail.com>


# df5d8c80 24-Mar-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

9p: revert tsyncfs related changes

Now that we use write_inode to flush server
cache related to fid, we don't need tsyncfs either fort dotl or dotu
protocols. For dotu this helps to do a more efficient server flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 095d3da6 12-Apr-2011 David S. Miller <davem@davemloft.net>

9p: Kill set but unused variable in 9p_client_{read,write}() and p9_client_readdir()

Fixes the following warnings:

net/9p/client.c:1305:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]
net/9p/client.c:1370:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]
net/9p/client.c:1769:18: warning: variable ‘total’ set but not used [-Wunused-but-set-variable]

Signed-off-by: David S. Miller <davem@davemloft.net>


# 25985edc 30-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi>

Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>


# eeff66ef 08-Mar-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Convert the in the 9p rpc call path to GFP_NOFS

Without this we can cause reclaim allocation in writepage.

[ 3433.448430] =================================
[ 3433.449117] [ INFO: inconsistent lock state ]
[ 3433.449117] 2.6.38-rc5+ #84
[ 3433.449117] ---------------------------------
[ 3433.449117] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
[ 3433.449117] kswapd0/505 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 3433.449117] (iprune_sem){+++++-}, at: [<ffffffff810ebbab>] shrink_icache_memory+0x45/0x2b1
[ 3433.449117] {RECLAIM_FS-ON-W} state was registered at:
[ 3433.449117] [<ffffffff8107fe5f>] mark_held_locks+0x52/0x70
[ 3433.449117] [<ffffffff8107ff02>] lockdep_trace_alloc+0x85/0x9f
[ 3433.449117] [<ffffffff810d353d>] slab_pre_alloc_hook+0x18/0x3c
[ 3433.449117] [<ffffffff810d3fd5>] kmem_cache_alloc+0x23/0xa2
[ 3433.449117] [<ffffffff8127be77>] idr_pre_get+0x2d/0x6f
[ 3433.449117] [<ffffffff815434eb>] p9_idpool_get+0x30/0xae
[ 3433.449117] [<ffffffff81540123>] p9_client_rpc+0xd7/0x9b0
[ 3433.449117] [<ffffffff815427b0>] p9_client_clunk+0x88/0xdb
[ 3433.449117] [<ffffffff811d56e5>] v9fs_evict_inode+0x3c/0x48
[ 3433.449117] [<ffffffff810eb511>] evict+0x1f/0x87
[ 3433.449117] [<ffffffff810eb5c0>] dispose_list+0x47/0xe3
[ 3433.449117] [<ffffffff810eb8da>] evict_inodes+0x138/0x14f
[ 3433.449117] [<ffffffff810d90e2>] generic_shutdown_super+0x57/0xe8
[ 3433.449117] [<ffffffff810d91e8>] kill_anon_super+0x11/0x50
[ 3433.449117] [<ffffffff811d4951>] v9fs_kill_super+0x49/0xab
[ 3433.449117] [<ffffffff810d926e>] deactivate_locked_super+0x21/0x46
[ 3433.449117] [<ffffffff810d9e84>] deactivate_super+0x40/0x44
[ 3433.449117] [<ffffffff810ef848>] mntput_no_expire+0x100/0x109
[ 3433.449117] [<ffffffff810f0aeb>] sys_umount+0x2f1/0x31c
[ 3433.449117] [<ffffffff8102c87b>] system_call_fastpath+0x16/0x1b
[ 3433.449117] irq event stamp: 192941
[ 3433.449117] hardirqs last enabled at (192941): [<ffffffff81568dcf>] _raw_spin_unlock_irq+0x2b/0x30
[ 3433.449117] hardirqs last disabled at (192940): [<ffffffff810b5f97>] shrink_inactive_list+0x290/0x2f5
[ 3433.449117] softirqs last enabled at (188470): [<ffffffff8105fd65>] __do_softirq+0x133/0x152
[ 3433.449117] softirqs last disabled at (188455): [<ffffffff8102d7cc>] call_softirq+0x1c/0x28
[ 3433.449117]
[ 3433.449117] other info that might help us debug this:
[ 3433.449117] 1 lock held by kswapd0/505:
[ 3433.449117] #0: (shrinker_rwsem){++++..}, at: [<ffffffff810b52e2>] shrink_slab+0x38/0x15f
[ 3433.449117]
[ 3433.449117] stack backtrace:
[ 3433.449117] Pid: 505, comm: kswapd0 Not tainted 2.6.38-rc5+ #84
[ 3433.449117] Call Trace:
[ 3433.449117] [<ffffffff8107fbce>] ? valid_state+0x17e/0x191
[ 3433.449117] [<ffffffff81036896>] ? save_stack_trace+0x28/0x45
[ 3433.449117] [<ffffffff81080426>] ? check_usage_forwards+0x0/0x87
[ 3433.449117] [<ffffffff8107fcf4>] ? mark_lock+0x113/0x22c
[ 3433.449117] [<ffffffff8108105f>] ? __lock_acquire+0x37a/0xcf7
[ 3433.449117] [<ffffffff8107fc0e>] ? mark_lock+0x2d/0x22c
[ 3433.449117] [<ffffffff81081077>] ? __lock_acquire+0x392/0xcf7
[ 3433.449117] [<ffffffff810b14d2>] ? determine_dirtyable_memory+0x15/0x28
[ 3433.449117] [<ffffffff81081a33>] ? lock_acquire+0x57/0x6d
[ 3433.449117] [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117] [<ffffffff81567d85>] ? down_read+0x47/0x5c
[ 3433.449117] [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117] [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117] [<ffffffff810b5385>] ? shrink_slab+0xdb/0x15f
[ 3433.449117] [<ffffffff810b69bc>] ? kswapd+0x574/0x96a
[ 3433.449117] [<ffffffff810b6448>] ? kswapd+0x0/0x96a
[ 3433.449117] [<ffffffff810714e2>] ? kthread+0x7d/0x85
[ 3433.449117] [<ffffffff8102d6d4>] ? kernel_thread_helper+0x4/0x10
[ 3433.449117] [<ffffffff81569200>] ? restore_args+0x0/0x30
[ 3433.449117] [<ffffffff81071465>] ? kthread+0x0/0x85
[ 3433.449117] [<ffffffff8102d6d0>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# c0aa4caf 28-Feb-2011 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Implement syncfs 9P operation

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# f735195d 16-Feb-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[net/9p] Small non-IO PDUs for zero-copy supporting transports.

If a transport prefers payload to be sent separate from the PDU
(P9_TRANS_PREF_PAYLOAD_SEP), there is no need to allocate msize
PDU buffers(struct p9_fcall).

This patch allocates only upto 4k buffers for this kind of transports
and there won't be any change to the legacy transports.

Hence, this patch on top of zero copy changes allows user to
specify higher msizes through the mount option
without hogging the kernel heap.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# ca41bb3e 01-Feb-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[net/9p] Handle Zero Copy TREAD/RERROR case in !dotl case.

This takes care of copying out error buffers from user buffer
payloads when we are using zero copy. This happens because the
only payload buffer the server has to respond to the request is
the user buffer given for the zero copy read.

Because we only use zerocopy when the amount of data to transfer
is greater than a certain size (currently 4K) and error strings are
limited to ERRMAX (currently 128) we don't need to worry about there
being sufficient space for the error to fit in the payload.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 2c66523f 16-Feb-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[net/9p] readdir zerocopy changes for 9P2000.L protocol.

Modify p9_client_readdir() to check the transport preference and act according
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 1fc52481 13-Feb-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[net/9p] Write side zerocopy changes for 9P2000.L protocol.

Modify p9_client_write() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# bb2f8a55 28-Jan-2011 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[net/9p] Read side zerocopy changes for 9P2000.L protocol.

Modify p9_client_read() to check the transport preference and act accordingly.
If the preference is P9_TRANS_PREF_PAYLOAD_SEP, send the payload
separately instead of putting it directly on PDU.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# f6ac55b6 26-Oct-2010 Sanchit Garg <sancgarg@linux.vnet.ibm.com>

net/9p: Return error on read with NULL buffer

This patch ensures that a read(fd, NULL, 10) returns EFAULT on a 9p file.

Signed-off-by: Sanchit Garg <sancgarg@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# b165d601 22-Oct-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

9p: Add datasync to client side TFSYNC/RFSYNC for dotl

SYNOPSIS
size[4] Tfsync tag[2] fid[4] datasync[4]

size[4] Rfsync tag[2]

DESCRIPTION

The Tfsync transaction transfers ("flushes") all modified in-core data of
file identified by fid to the disk device (or other permanent storage
device) where that file resides.

If datasync flag is specified data will be fleshed but does not flush
modified metadata unless that metadata is needed in order to allow a
subsequent data retrieval to be correctly handled.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 7b3bb3fe 18-Oct-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Return error if we fail to encode protocol data

We need to return error in case we fail to encode data in protocol buffer.
This patch also return error in case of a failed copy_from_user.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 52f44e0d 29-Sep-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

net/9p: Add waitq to VirtIO transport.

If there is not enough space for the PDU on the VirtIO ring, current
code returns -EIO propagating the error to user.

This patch introduced a wqit_queue on the channel, and lets the process
wait on this queue until VirtIO ring frees up.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 329176cc 28-Sep-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement TREADLINK operation for 9p2000.L

Synopsis

size[4] TReadlink tag[2] fid[4]
size[4] RReadlink tag[2] target[s]

Description
Readlink is used to return the contents of the symoblic link
referred by fid. Contents of symboic link is returned as a
response.

target[s] - Contents of the symbolic link referred by fid.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 1d769cd1 26-Sep-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement TGETLOCK

Synopsis

size[4] TGetlock tag[2] fid[4] getlock[n]
size[4] RGetlock tag[2] getlock[n]

Description

TGetlock is used to test for the existence of byte range posix locks on a file
identified by given fid. The reply contains getlock structure. If the lock could
be placed it returns F_UNLCK in type field of getlock structure. Otherwise it
returns the details of the conflicting locks in the getlock structure

getlock structure:
type[1] - Type of lock: F_RDLCK, F_WRLCK
start[8] - Starting offset for lock
length[8] - Number of bytes to check for the lock
If length is 0, check for lock in all bytes starting at the location
'start' through to the end of file
pid[4] - PID of the process that wants to take lock/owns the task
in case of reply
client[4] - Client id of the system that owns the process which
has the conflicting lock

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# a099027c 27-Sep-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement TLOCK

Synopsis

size[4] TLock tag[2] fid[4] flock[n]
size[4] RLock tag[2] status[1]

Description

Tlock is used to acquire/release byte range posix locks on a file
identified by given fid. The reply contains status of the lock request

flock structure:
type[1] - Type of lock: F_RDLCK, F_WRLCK, F_UNLCK
flags[4] - Flags could be either of
P9_LOCK_FLAGS_BLOCK - Blocked lock request, if there is a
conflicting lock exists, wait for that lock to be released.
P9_LOCK_FLAGS_RECLAIM - Reclaim lock request, used when client is
trying to reclaim a lock after a server restrart (due to crash)
start[8] - Starting offset for lock
length[8] - Number of bytes to lock
If length is 0, lock all bytes starting at the location 'start'
through to the end of file
pid[4] - PID of the process that wants to take lock
client_id[4] - Unique client id

status[1] - Status of the lock request, can be
P9_LOCK_SUCCESS(0), P9_LOCK_BLOCKED(1), P9_LOCK_ERROR(2) or
P9_LOCK_GRACE(3)
P9_LOCK_SUCCESS - Request was successful
P9_LOCK_BLOCKED - A conflicting lock is held by another process
P9_LOCK_ERROR - Error while processing the lock request
P9_LOCK_GRACE - Server is in grace period, it can't accept new lock
requests in this period (except locks with
P9_LOCK_FLAGS_RECLAIM flag set)

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 920e65dc 22-Sep-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

[9p] Introduce client side TFSYNC/RFSYNC for dotl.

SYNOPSIS
size[4] Tfsync tag[2] fid[4]

size[4] Rfsync tag[2]

DESCRIPTION

The Tfsync transaction transfers ("flushes") all modified in-core data of
file identified by fid to the disk device (or other permanent storage
device) where that file resides.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 8e44a080 25-Aug-2010 jvrao <jvrao@linux.vnet.ibm.com>

net/9p: Add a Warning to catch NULL fids passed to p9_client_clunk().

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 4f7ebe80 28-Jul-2010 Arun R Bharadwaj <arun@linux.vnet.ibm.com>

net/9p: This patch implements TLERROR/RLERROR on the 9P client.

Signed-off-by: Arun R Bharadwaj <arun@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 32a875ad 19-Oct-2010 stephen hemminger <shemminger@vyatta.com>

9p: client code cleanup

Make p9_client_version static since only used in one file.
Remove p9_client_auth because it is defined but never used.
Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a02cec21 22-Sep-2010 Eric Dumazet <eric.dumazet@gmail.com>

net: return operator cleanup

Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 62b2be59 24-Aug-2010 Latchesar Ionkov <lionkov@gmail.com>

fs/9p, net/9p: memory leak fixes

Four memory leak fixes in the 9P code.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0b1208b1 01-Jul-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

fs/9p: destroy fid on failed remove

9P spec says:
"It is correct to consider remove to be a clunk with the
side effect of removing the file if permissions allow. "

So even if remove fails we need to destroy the fid.

Without this patch an rmdir on a directory with contents leave
the new cloned directory fid fid attached to fidlist. On umount
we dump the fids on the fidlist

~# rmdir /mnt2/test4/
rmdir: failed to remove `/mnt2/test4/': Directory not empty
~# umount /mnt2/
~# dmesg
[ 228.474323] Found fid 3 not clunked

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# eda25e46 31-May-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Implement TXATTRCREATE 9p call

TXATTRCREATE: Prepare a fid for setting xattr value on a file system object.

size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
size[4] RXATTRCREATE tag[2]

txattrcreate gets a fid pointing to xattr. This fid can later be
used to set the xattr value.

flag value is derived from set Linux setxattr. The manpage says
"The flags parameter can be used to refine the semantics of the operation.
XATTR_CREATE specifies a pure create, which fails if the named attribute
exists already. XATTR_REPLACE specifies a pure replace operation, which
fails if the named attribute does not already exist. By default (no flags),
the extended attribute will be created if need be, or will simply replace
the value if the attribute exists."

The actual setxattr operation happens when the fid is clunked. At that point
the written byte count and the attr_size specified in TXATTRCREATE should be
same otherwise an error will be returned.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0ef63f34 31-May-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Implement attrwalk 9p call

TXATTRWALK: Descend a ATTR namespace

size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
size[4] RXATTRWALK tag[2] size[8]

txattrwalk gets a fid pointing to xattr. This fid can later be
used to read the xattr value. If name is NULL the fid returned
can be used to get the list of extended attribute associated to
the file system object.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# ef56547e 22-Jun-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement LOPEN

Implement 9p2000.L version of open(LOPEN) interface in 9p client.

For LOPEN, no need to convert the flags to and from 9p mode to VFS mode.

Synopsis:

size[4] Tlopen tag[2] fid[4] mode[4]

size[4] Rlopen tag[2] qid[13] iounit[4]

[Fix mode bit format - jvrao@linux.vnet.ibm.com]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>


# 5643135a 17-Jun-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

fs/9p: This patch implements TLCREATE for 9p2000.L protocol.

SYNOPSIS

size[4] Tlcreate tag[2] fid[4] name[s] flags[4] mode[4] gid[4]

size[4] Rlcreate tag[2] qid[13] iounit[4]

DESCRIPTION

The Tlreate request asks the file server to create a new regular file with the
name supplied, in the directory (dir) represented by fid.
The mode argument specifies the permissions to use. New file is created with
the uid if the fid and with supplied gid.

The flags argument represent Linux access mode flags with which the caller
is requesting to open the file with. Protocol allows all the Linux access
modes but it is upto the server to allow/disallow any of these acess modes.
If the server doesn't support any of the access mode, it is expected to
return error.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 01a622bd 16-Jun-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement TMKDIR

Implement TMKDIR as part of 2000.L Work

Synopsis

size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]

size[4] Rmkdir tag[2] qid[13]

Description

mkdir asks the file server to create a directory with given name,
mode and gid. The qid for the new directory is returned with
the mkdir reply message.

Note: 72 is selected as the opcode for TMKDIR from the reserved list.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 4b43516a 16-Jun-2010 M. Mohan Kumar <mohan@in.ibm.com>

9p: Implement TMKNOD

Synopsis

size[4] Tmknod tag[2] fid[4] name[s] mode[4] major[4] minor[4] gid[4]

size[4] Rmknod tag[2] qid[13]

Description

mknod asks the file server to create a device node with given major and
minor number, mode and gid. The qid for the new device node is returned
with the mknod reply message.

[sripathik@in.ibm.com: Fix error handling code]

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 50cc42ff 09-Jun-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

9p: Define and implement TSYMLINK for 9P2000.L

Create a symbolic link

SYNOPSIS

size[4] Tsymlink tag[2] fid[4] name[s] symtgt[s] gid[4]

size[4] Rsymlink tag[2] qid[13]

DESCRIPTION

Create a symbolic link named 'name' pointing to 'symtgt'.
gid represents the effective group id of the caller.
The permissions of a symbolic link are irrelevant hence it is omitted
from the protocol.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 652df9a7 03-Jun-2010 Venkateswararao Jujjuri (JV) <jvrao@linux.vnet.ibm.com>

9p: Define and implement TLINK for 9P2000.L

This patch adds a helper function to get the dentry from inode and
uses it in creating a Hardlink

SYNOPSIS

size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]

size[4] Rlink tag[2]

DESCRIPTION

Create a link 'newpath' in directory pointed by dfid linking to oldfid path.

[sripathik@in.ibm.com : p9_client_link should not free req structure
if p9_client_rpc has returned an error.]

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 87d7845a 18-Jun-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: Implement client side of setattr for 9P2000.L protocol.

SYNOPSIS

size[4] Tsetattr tag[2] attr[n]

size[4] Rsetattr tag[2]

DESCRIPTION

The setattr command changes some of the file status information.
attr resembles the iattr structure used in Linux kernel. It
specifies which status parameter is to be changed and to what
value. It is laid out as follows:

valid[4]
specifies which status information is to be changed. Possible
values are:
ATTR_MODE (1 << 0)
ATTR_UID (1 << 1)
ATTR_GID (1 << 2)
ATTR_SIZE (1 << 3)
ATTR_ATIME (1 << 4)
ATTR_MTIME (1 << 5)
ATTR_ATIME_SET (1 << 7)
ATTR_MTIME_SET (1 << 8)

The last two bits represent whether the time information
is being sent by the client's user space. In the absense
of these bits the server always uses server's time.

mode[4]
File permission bits

uid[4]
Owner id of file

gid[4]
Group id of the file

size[8]
File size

atime_sec[8]
Time of last file access, seconds

atime_nsec[8]
Time of last file access, nanoseconds

mtime_sec[8]
Time of last file modification, seconds

mtime_nsec[8]
Time of last file modification, nanoseconds

Explanation of the patches:
--------------------------

*) The kernel just copies relevent contents of iattr structure to
p9_iattr_dotl structure and passes it down to the client. The
only check it has is calling inode_change_ok()
*) The p9_iattr_dotl structure does not have ctime and ia_file
parameters because I don't think these are needed in our case.
The client user space can request updating just ctime by calling
chown(fd, -1, -1). This is handled on server side without a need
for putting ctime on the wire.
*) The server currently supports changing mode, time, ownership and
size of the file.
*) 9P RFC says "Either all the changes in wstat request happen, or
none of them does: if the request succeeds, all changes were made;
if it fails, none were."
I have not done anything to implement this specifically because I
don't see a reason.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# f0853122 12-Jul-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: getattr client implementation for 9P2000.L protocol.

SYNOPSIS

size[4] Tgetattr tag[2] fid[4] request_mask[8]

size[4] Rgetattr tag[2] lstat[n]

DESCRIPTION

The getattr transaction inquires about the file identified by fid.
request_mask is a bit mask that specifies which fields of the
stat structure is the client interested in.

The reply will contain a machine-independent directory entry,
laid out as follows:

st_result_mask[8]
Bit mask that indicates which fields in the stat structure
have been populated by the server

qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.

qid.vers[4]
version number for given path

qid.path[8]
the file server's unique identification for the file

st_mode[4]
Permission and flags

st_uid[4]
User id of owner

st_gid[4]
Group ID of owner

st_nlink[8]
Number of hard links

st_rdev[8]
Device ID (if special file)

st_size[8]
Size, in bytes

st_blksize[8]
Block size for file system IO

st_blocks[8]
Number of file system blocks allocated

st_atime_sec[8]
Time of last access, seconds

st_atime_nsec[8]
Time of last access, nanoseconds

st_mtime_sec[8]
Time of last modification, seconds

st_mtime_nsec[8]
Time of last modification, nanoseconds

st_ctime_sec[8]
Time of last status change, seconds

st_ctime_nsec[8]
Time of last status change, nanoseconds

st_btime_sec[8]
Time of creation (birth) of file, seconds

st_btime_nsec[8]
Time of creation (birth) of file, nanoseconds

st_gen[8]
Inode generation

st_data_version[8]
Data version number

request_mask and result_mask bit masks contain the following bits
#define P9_STATS_MODE 0x00000001ULL
#define P9_STATS_NLINK 0x00000002ULL
#define P9_STATS_UID 0x00000004ULL
#define P9_STATS_GID 0x00000008ULL
#define P9_STATS_RDEV 0x00000010ULL
#define P9_STATS_ATIME 0x00000020ULL
#define P9_STATS_MTIME 0x00000040ULL
#define P9_STATS_CTIME 0x00000080ULL
#define P9_STATS_INO 0x00000100ULL
#define P9_STATS_SIZE 0x00000200ULL
#define P9_STATS_BLOCKS 0x00000400ULL

#define P9_STATS_BTIME 0x00000800ULL
#define P9_STATS_GEN 0x00001000ULL
#define P9_STATS_DATA_VERSION 0x00002000ULL

#define P9_STATS_BASIC 0x000007ffULL
#define P9_STATS_ALL 0x00003fffULL

This patch implements the client side of getattr implementation for
9P2000.L. It introduces a new structure p9_stat_dotl for getting
Linux stat information along with QID. The data layout is similar to
stat structure in Linux user space with the following major
differences:

inode (st_ino) is not part of data. Instead qid is.

device (st_dev) is not part of data because this doesn't make sense
on the client.

All time variables are 64 bit wide on the wire. The kernel seems to use
32 bit variables for these variables. However, some of the architectures
have used 64 bit variables and glibc exposes 64 bit variables to user
space on some architectures. Hence to be on the safer side we have made
these 64 bit in the protocol. Refer to the comments in
include/asm-generic/stat.h

There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
st_data_version apart from the bitmask, st_result_mask. The bit mask
is filled by the server to indicate which stat fields have been
populated by the server. Currently there is no clean way for the
server to obtain these additional fields, so it sends back just the
basic fields.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>


# 69d4b443 01-Jun-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

net/9p: Handle the server returned error properly

We need to get the negative errno value in the kernel
even for dotl.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 7751bdb3 04-Jun-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: readdir implementation for 9p2000.L

This patch implements the kernel part of readdir() implementation for 9p2000.L

Change from V3: Instead of inode, server now sends qids for each dirent

SYNOPSIS

size[4] Treaddir tag[2] fid[4] offset[8] count[4]
size[4] Rreaddir tag[2] count[4] data[count]

DESCRIPTION

The readdir request asks the server to read the directory specified by 'fid'
at an offset specified by 'offset' and return as many dirent structures as
possible that fit into count bytes. Each dirent structure is laid out as
follows.

qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.

qid.vers[4]
version number for given path

qid.path[8]
the file server's unique identification for the file

offset[8]
offset into the next dirent.

type[1]
type of this directory entry.

name[256]
name of this directory entry.

This patch adds v9fs_dir_readdir_dotl() as the readdir() call for 9p2000.L.
This function sends P9_TREADDIR command to the server. In response the server
sends a buffer filled with dirent structures. This is different from the
existing v9fs_dir_readdir() call which receives stat structures from the server.
This results in significant speedup of readdir() on large directories.
For example, doing 'ls >/dev/null' on a directory with 10000 files on my
laptop takes 1.088 seconds with the existing code, but only takes 0.339 seconds
with the new readdir.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 4681dbda 24-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: add 9P2000.L rename operation

I made a V2 of this patch on top of my patches for VFS switches.
All the changes were due to change in some offsets.

rename - change name of file or directory

size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
size[4] Rrename tag[2]

The rename message is used to change the name of a file, possibly moving it
to a new directory. The 9P wstat message can only rename a file within the
same directory.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# bda8e775 24-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: add 9P2000.L statfs operation

I made a V2 of this patch on top of my patches for VFS switches. The
change was adding v9fs_statfs pointer to v9fs_super_ops_dotl
instead of v9fs_super_ops.

statfs - get file system statistics

size[4] Tstatfs tag[2] fid[4]
size[4] Rstatfs tag[2] type[4] bsize[4] blocks[8] bfree[8] bavail[8]
files[8] ffree[8] fsid[8] namelen[4]

The statfs message is used to request file system information returned
by the statfs(2) system call, which is used by df(1) to report file
system and disk space usage.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# c56e4acf 24-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: VFS switches for 9p2000.L: protocol and client changes

Prepare p9pdu_read/write functions to handle multiple protocols.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 3dc9fef6 05-Apr-2010 Dan Carpenter <error27@gmail.com>

9p: saving negative to unsigned char

Saving -EINVAL as unsigned char truncates the high bits and changes it
into 234 instead of -22. This breaks the test for "if (ret == -EINVAL)"
in parse_opts().

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 6d96d3ab 29-Mar-2010 Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

9p: Make sure we are able to clunk the cached fid on umount

dcache prune happen on umount. So we cannot mark the client
satus disconnect. That will prevent a 9p call to the server

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5a0e3ad6 24-Mar-2010 Tejun Heo <tj@kernel.org>

include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


# 45bc21ed 08-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9p: Change the name of new protocol from 9p2010.L to 9p2000.L

This patch changes the name of the new 9P protocol from 9p2010.L to
9p2000.u. This is because we learnt that the name 9p2010 is already
being used by others.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# c5a7697d 05-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9P2010.L handshake: .L protocol negotiation

This patch adds 9P2010.L protocol negotiation with the server

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 342fee1d 05-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9P2010.L handshake: Remove "dotu" variable

Removes 'dotu' variable and make everything dependent
on 'proto_version' field.

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0fb80abd 05-Mar-2010 Sripathi Kodi <sripathik@in.ibm.com>

9P2010.L handshake: Add mount option

Add new mount V9FS mount option to specify protocol version

This patch adds a new mount option to specify protocol version.
With this option it is possible to use "-o version=" switch to
specify 9P protocol version to use. Valid options for version
are:
9p2000
9p2000.u
9p2010.L

Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 8781ff94 08-Feb-2010 Eric Van Hensbergen <ericvh@gmail.com>

9p: fix p9_client_destroy unconditional calling v9fs_put_trans

restructure client create code to handle error cases better and
only cleanup initialized portions of the stack.

Signed-off-by: Venkateswararao Jujjuri <jvrao@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# d8c8a9e3 08-Feb-2010 Eric Van Hensbergen <ericvh@gmail.com>

9p: fix option parsing

Options pointer is being moved before calling kfree() which seems
to cause problems. This uses a separate pointer to track and free
original allocation.

Signed-off-by: Venkateswararao Jujjuri <jvrao@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>w


# 9d6939da 15-Jan-2010 Eric Van Hensbergen <ericvh@gmail.com>

net/9p: fix statsize inside twstat

stat structures contain a size prefix. In our twstat messages
we were including the size of the size prefix in the prefix, which is not
what the protocol wants, and Inferno servers would complain.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 349d3bb8 15-Jan-2010 Eric Van Hensbergen <ericvh@gmail.com>

net/9p: fail when user specifies a transport which we can't find

If the user specifies a transport and we can't find it, we failed back
to the default trainsport silently. This patch will make the code
complain more loudly and return an error code.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 3e2796a9 02-Nov-2009 Eric Van Hensbergen <ericvh@gmail.com>

9p: fix readdir corner cases

The patch below also addresses a couple of other corner cases in readdir
seen with a large (e.g. 64k) msize. I'm not sure what people think of
my co-opting of fid->aux here. I'd be happy to rework if there's a better
way.

When the size of the user supplied buffer passed to readdir is smaller
than the data returned in one go by the 9P read request, v9fs_dir_readdir()
currently discards extra data so that, on the next call, a 9P read
request will be issued with offset < previous offset + bytes returned,
which voilates the constraint described in paragraph 3 of read(5) description.
This patch preseves the leftover data in fid->aux for use in the next call.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0aad37ef 17-Aug-2009 Abhishek Kulkarni <adkulkar@umail.iu.edu>

net/9p: insulate the client against an invalid error code sent by a 9p server

A looney tunes server sending an invalid error code (which is !IS_ERR_VALUE)
can result in a client oops. So fix it by adding a check and converting unknown
or invalid error codes to -ESERVERFAULT.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0e15597e 19-Jul-2009 Abhishek Kulkarni <adkulkar@umail.iu.edu>

9p: minor comment fixes

Fix the comments -- mostly the improper and/or missing descriptions
of function parameters.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# eedfe1c4 14-Jul-2009 Abhishek Kulkarni <adkulkar@umail.iu.edu>

9p: Possible regression in p9_client_stat

Fix a possible regression with p9_client_stat where it can try to kfree
an ERR_PTR after an erroneous p9pdu_readf. Also remove an unnecessary data
buffer increment in p9_client_read.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# a17d1720 14-Jul-2009 Abhishek Kulkarni <adkulkar@umail.iu.edu>

9p: default 9p transport module fix

The default 9p transport module is not chosen unless an option parameter (any)
is passed to mount, which thus returns a ENOPROTOSUPPORT. This fix moves the
check out of parse_opts into p9_client_create.

Signed-off-by: Abhishek Kulkarni <adkulkar@umail.iu.edu>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 1bab88b2 05-Apr-2009 Latchesar Ionkov <lucho@ionkov.net>

net/9p: handle correctly interrupted 9P requests

Currently the 9p code crashes when a operation is interrupted, i.e. for
example when the user presses ^C while reading from a file.

This patch fixes the code that is responsible for interruption and flushing
of 9P operations.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>


# 742b11a7 05-Apr-2009 Latchesar Ionkov <lucho@ionkov.net>

net/9p: return error when p9_client_stat fails

p9_client_stat function doesn't return correct value if it fails.
p9_client_stat should return ERR_PTR of the error value when it fails.
Instead, it always returns a value to the allocated p9_wstat struct even
when it is not populated correctly.

This patch makes p9_client_stat to handle failure correctly.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>


# 453ed90d 05-Apr-2009 Latchesar Ionkov <lucho@ionkov.net>

net/9p: set correct stat size when sending Twstat messages

The 9P2000 Twstat message requires the size of the stat structure to be
specified. Currently the 9p code writes zero instead of the actual size.
This behavior confuses some of the file servers that check if the size is
correct.

This patch adds a new function that calculcates the stat size and puts the
value in the appropriate place in the 9P message.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>


# 24e94de4 18-Jan-2009 Roel Kluin <roel.kluin@gmail.com>

net/9p: fid->fid is used uninitialized

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f8b9d53a 13-Nov-2008 David Howells <dhowells@redhat.com>

CRED: Wrap task credential accesses in 9P2000 filesystem

Wrap access to task credentials so that they can be separated more easily from
the task_struct during the introduction of COW creds.

Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
sense to use RCU directly rather than a convenient wrapper; these will be
addressed by later patches.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Reviewed-by: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Ron Minnich <rminnich@sandia.gov>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: v9fs-developer@lists.sourceforge.net
Signed-off-by: James Morris <jmorris@namei.org>


# b0d5fdef 04-Nov-2008 Randy Dunlap <randy.dunlap@oracle.com>

net/9p: fix printk format warnings

Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 9f3e9bbe 28-Oct-2008 Roel Kluin <roel.kluin@gmail.com>

unsigned fid->fid cannot be negative

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 45abdf1c 23-Oct-2008 Tom Tucker <tom@opengridcomputing.com>

p9: Fix leak of waitqueue in request allocation path

If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.

If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.

Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 82b189ea 23-Oct-2008 Tom Tucker <tom@opengridcomputing.com>

9p: Remove unneeded free of fcall for Flush

T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# cac23d65 23-Oct-2008 Tom Tucker <tom@opengridcomputing.com>

9p: Make all client spin locks IRQ safe

The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.

Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# b22cecdd 05-Nov-2008 Randy Dunlap <randy.dunlap@oracle.com>

net/9p: fix printk format warnings

Fix printk format warnings in net/9p.
Built cleanly on 7 arches.

net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e45c5405 22-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: fix sparse warnings

Several sparse warnings were introduced by patches accepted during the merge
window which weren't caught. This patch fixes those warnings.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# e7f4b8f1 17-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: Improve debug support

The new debug support lacks some of the information that the previous fcprint
code provided -- this patch focuses on better presentation of debug data along
with more helpful debug along error paths.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 51a87c55 16-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: rework client code to use new protocol support functions

Now that the new protocol functions are in place, this patch switches
the client code to using the new support code.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 51d71f9f 16-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: remove 9p fcall debug prints

One of the current debug options allows users to get a verbose dump of fcalls.
This isn't really necessary as correctly parsed protocol frames can be printed
as part of the code in the client functions. The consolidated printfcalls
structure would require new entries to be added for every extension. This
patch removes the debug print methods and their use.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 6936bf60 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: encapsulate version function

Alsmot all 9P client wire functions have their own (set of) functions.
Tversion is an exception as its encapsulated into the client_create code.

This patch moves the protocol specifics of this to a function to match the
rest of the code.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 06b55b46 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: move dirread to fs layer

Currently reading a directory is implemented in the client code.
This function is not actually a wire operation, but a meta operation
which calls read operations and processes the results.

This patch moves this functionality to the fs layer and calls component
wire operations instead of constructing their packets. This provides a
cleaner separation and will help when we reorganize the client functions
and protocol processing methods.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# fbedadc1 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: move readn meta-function from client to fs layer

There are a couple of methods in the client code which aren't actually
wire operations. To keep things organized cleaner, these operations are
being moved to the fs layer.

This patch moves the readn meta-function (which executes multiple wire
reads until a buffer is full) to the fs layer.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0fc9655e 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: consolidate read/write functions

Currently there are two separate versions of read and write. One for
dealing with user buffers and the other for dealing with kernel buffers.
There is a tremendous amount of code duplication in the otherwise
identical versions of these functions. This patch adds an additional
user buffer parameter to read and write and conditionalizes handling of
the buffer on whether the kernel buffer or the user buffer is populated.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 91b8534f 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: make rpc code common and rework flush code

This code moves the rpc function to the common client base,
reorganizes the flush code to be more simple and stable, and
makes the necessary adjustments to the underlying transports
to adapt to the new structure.

This reduces the overall amount of code duplication between the
transports and should make adding new transports more straightforward.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 673d62cd 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: apply common request code to trans_fd

Apply the now common p9_req_t structure to the fd transport.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# fea511a6 13-Oct-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: move request management to client code

The virtio transport uses a simplified request management system
that I want to use for all transports. This patch adapts and moves the
exisiting code for managing requests to the client common code.
Later patches will apply these mechanisms to the other transports.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 5503ac56 13-Oct-2008 Eric Van Hensbergen <ericvh@ericvh-desktop.austin.ibm.com>

9p: remove unnecessary prototypes

Cleanup files by reordering functions in order to remove need for
unnecessary function prototypes.

There are no code changes here, just functions being moved around and
prototypes being eliminated.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 8b81ef58 13-Oct-2008 Eric Van Hensbergen <ericvh@ericvh-desktop.austin.ibm.com>

9p: consolidate transport structure

Right now there is a transport module structure which provides per-transport
type functions and data and a transport structure which contains per-instance
public data as well as function pointers to instance specific functions.

This patch moves public transport visible instance data to the client
structure (which in some cases had duplicate data) and consolidates the
functions into the transport module structure.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# a447c093 13-Oct-2008 Steven Whitehouse <swhiteho@redhat.com>

vfs: Use const for kernel parser table

This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.

This was posted for review some time ago and I believe its been in -mm
since then.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 72029fe8 24-Sep-2008 Tejun Heo <tj@kernel.org>

9p: implement proper trans module refcounting and unregistration

9p trans modules aren't refcounted nor were they unregistered
properly. Fix it.

* Add 9p_trans_module->owner and reference the module on each trans
instance creation and put it on destruction.

* Protect v9fs_trans_list with a spinlock. This isn't strictly
necessary as the list is manipulated only during module loading /
unloading but it's a good idea to make the API safe.

* Unregister trans modules when the corresponding module is being
unloaded.

* While at it, kill unnecessary EXPORT_SYMBOL on p9_trans_fd_init().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 728fc4ef 07-Mar-2008 Josef 'Jeff' Sipek <jeffpc@josefsipek.net>

9p: Correct fidpool creation failure in p9_client_create

On error, p9_idpool_create returns an ERR_PTR-encoded errno.

Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>


# bb8ffdfc 07-Mar-2008 Eric Van Hensbergen <ericvh@ericvh-desktop.(none)>

9p: propagate parse_option changes to client and transports

Propagate changes that were made to the parse_options code to the
other parse options pieces present in the other modules. Looks like
the client parse options was probably corrupting the parse string
and causing problems for others.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 8a0dc95f 06-Feb-2008 Eric Van Hensbergen <ericvh@opteron.homeip.net>

9p: transport API reorganization

This merges the mux.c (including the connection interface) with trans_fd
in preparation for transport API changes. Ultimately, trans_fd will need
to be rewritten to clean it up and simplify the implementation, but this
reorganization is viewed as the first step.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# e2735b77 06-Feb-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: block-based virtio client

This replaces the console-based virto client with a block-based
client using a single request queue.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 043aba40 06-Feb-2008 Eric Van Hensbergen <ericvh@gmail.com>

9p: create transport rpc cut-thru

Add a new transport function which allows a cut-thru directly to
the transport instead of processing request through the mux if the
cut-thru exists.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# afcf0c13 05-Feb-2008 Martin Stava <martin.stava@gmail.com>

9p: fix bug in p9_clone_stat

This patch fixes a bug in the copying of 9P
stat information where string references
weren't being updated properly.

Signed-off-by: Martin Sava <martin.stava@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# fb0466c3 17-Oct-2007 Eric Van Hensbergen <ericvh@ericvh-laptop.(none)>

9p: fix bad kconfig cross-dependency

This patch moves transport dynamic registration and matching to the net
module to prevent a bad Kconfig dependency between the net and fs 9p modules.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# ba17674f 17-Oct-2007 Latchesar Ionkov <lucho@ionkov.net>

9p: attach-per-user

The 9P2000 protocol requires the authentication and permission checks to be
done in the file server. For that reason every user that accesses the file
server tree has to authenticate and attach to the server separately.
Multiple users can share the same connection to the server.

Currently v9fs does a single attach and executes all I/O operations as a
single user. This makes using v9fs in multiuser environment unsafe as it
depends on the client doing the permission checking.

This patch improves the 9P2000 support by allowing every user to attach
separately. The patch defines three modes of access (new mount option
'access'):

- attach-per-user (access=user) (default mode for 9P2000.u)
If a user tries to access a file served by v9fs for the first time, v9fs
sends an attach command to the server (Tattach) specifying the user. If
the attach succeeds, the user can access the v9fs tree.
As there is no uname->uid (string->integer) mapping yet, this mode works
only with the 9P2000.u dialect.

- allow only one user to access the tree (access=<uid>)
Only the user with uid can access the v9fs tree. Other users that attempt
to access it will get EPERM error.

- do all operations as a single user (access=any) (default for 9P2000)
V9fs does a single attach and all operations are done as a single user.
If this mode is selected, the v9fs behavior is identical with the current
one.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# a80d923e 17-Oct-2007 Eric Van Hensbergen <ericvh@opteron.(none)>

9p: Make transports dynamic

This patch abstracts out the interfaces to underlying transports so that
new transports can be added as modules. This should also allow kernel
configuration of transports without ifdef-hell.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# 0af8887e 13-Jul-2007 Eric Van Hensbergen <ericvh@ericvh-desktop.austin.ibm.com>

9p: fix a race condition bug in umount which caused a segfault

umounting partitions after heavy activity would sometimes trigger a
segmentation violation. This fix appears to remove that problem.
Fix originally provided by Latchesar Ionkov.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>


# bd238fb4 10-Jul-2007 Latchesar Ionkov <lucho@ionkov.net>

9p: Reorganization of 9p file system code

This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p. This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>