History log of /linux-master/fs/ocfs2/cluster/tcp.c
Revision Date Author Comments
# e52828cc 23-Jun-2023 David Howells <dhowells@redhat.com>

ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()

Switch ocfs2 from using sendpage() to using sendmsg() + MSG_SPLICE_PAGES so
that sendpage can be phased out.

Signed-off-by: David Howells <dhowells@redhat.com>

cc: Mark Fasheh <mark@fasheh.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-15-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 86d7bd6e 23-Jun-2023 David Howells <dhowells@redhat.com>

ocfs2: Fix use of slab data with sendpage

ocfs2 uses kzalloc() to allocate buffers for o2net_hand, o2net_keep_req and
o2net_keep_resp and then passes these to sendpage. This isn't really
allowed as the lifetime of slab objects is not controlled by page ref -
though in this case it will probably work. sendmsg() with MSG_SPLICE_PAGES
will, however, print a warning and give an error.

Fix it to use folio_alloc() instead to allocate a buffer for the handshake
message, keepalive request and reply messages.

Fixes: 98211489d414 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Mark Fasheh <mark@fasheh.com>
cc: Kurt Hackel <kurt.hackel@oracle.com>
cc: Joel Becker <jlbec@evilplan.org>
cc: Joseph Qi <joseph.qi@linux.alibaba.com>
cc: ocfs2-devel@oss.oracle.com
Link: https://lore.kernel.org/r/20230623225513.2732256-14-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 40e0b090 19-Jan-2023 Peilin Ye <peilin.ye@bytedance.com>

net/sock: Introduce trace_sk_data_ready()

As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
callback implementations. For example:

<...>
iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
<...>

Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 98123866 16-Dec-2022 Benjamin Coddington <bcodding@redhat.com>

Treewide: Stop corrupting socket's task_frag

Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation which the networking system uses to decide
when it is safe to use current->task_frag. The results of this are
unexpected corruption in task_frag when SUNRPC is involved in memory
reclaim.

The corruption can be seen in crashes, but the root cause is often
difficult to ascertain as a crashing machine's stack trace will have no
evidence of being near NFS or SUNRPC code. I believe this problem to
be much more pervasive than reports to the community may indicate.

Fix this by having kernel users of sockets that may corrupt task_frag due
to reclaim set sk_use_task_frag = false. Preemptively correcting this
situation for users that still set sk_allocation allows them to convert to
memalloc_nofs_save/restore without the same unexpected corruptions that are
sure to follow, unlikely to show up in testing, and difficult to bisect.

CC: Philipp Reisner <philipp.reisner@linbit.com>
CC: Lars Ellenberg <lars.ellenberg@linbit.com>
CC: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Josef Bacik <josef@toxicpanda.com>
CC: Keith Busch <kbusch@kernel.org>
CC: Christoph Hellwig <hch@lst.de>
CC: Sagi Grimberg <sagi@grimberg.me>
CC: Lee Duncan <lduncan@suse.com>
CC: Chris Leech <cleech@redhat.com>
CC: Mike Christie <michael.christie@oracle.com>
CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: Valentina Manea <valentina.manea.m@gmail.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: David Howells <dhowells@redhat.com>
CC: Marc Dionne <marc.dionne@auristor.com>
CC: Steve French <sfrench@samba.org>
CC: Christine Caulfield <ccaulfie@redhat.com>
CC: David Teigland <teigland@redhat.com>
CC: Mark Fasheh <mark@fasheh.com>
CC: Joel Becker <jlbec@evilplan.org>
CC: Joseph Qi <joseph.qi@linux.alibaba.com>
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: Dominique Martinet <asmadeus@codewreck.org>
CC: Ilya Dryomov <idryomov@gmail.com>
CC: Xiubo Li <xiubli@redhat.com>
CC: Chuck Lever <chuck.lever@oracle.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Trond Myklebust <trond.myklebust@hammerspace.com>
CC: Anna Schumaker <anna@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>

Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# de4eda9d 15-Sep-2022 Al Viro <viro@zeniv.linux.org.uk>

use less confusing names for iov_iter direction initializers

READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 6d4a93b6 07-Oct-2022 Joseph Qi <joseph.qi@linux.alibaba.com>

ocfs2: use bitmap API in fill_node_map

Pass bits directly into fill_node_map helper and use bitmap API directly
to simplify code.

Link: https://lkml.kernel.org/r/20221007124846.186453-2-joseph.qi@linux.alibaba.com
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>


# fa60ce2c 06-May-2021 Masahiro Yamada <masahiroy@kernel.org>

treewide: remove editor modelines and cruft

The section "19) Editor modelines and other cruft" in
Documentation/process/coding-style.rst clearly says, "Do not include any
of these in source files."

I recently receive a patch to explicitly add a new one.

Let's do treewide cleanups, otherwise some people follow the existing code
and attempt to upstream their favoriate editor setups.

It is even nicer if scripts/checkpatch.pl can check it.

If we like to impose coding style in an editor-independent manner, I think
editorconfig (patch [1]) is a saner solution.

[1] https://lore.kernel.org/lkml/20200703073143.423557-1-danny@kdrag0n.dev/

Link: https://lkml.kernel.org/r/20210324054457.1477489-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org> [auxdisplay]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# a0823b5e 14-Dec-2020 Tom Rix <trix@redhat.com>

fs/ocfs2/cluster/tcp.c: remove unneeded break

A break is not needed if it is preceded by a goto

Link: https://lkml.kernel.org/r/20201019175216.2329-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c488aead 27-May-2020 Christoph Hellwig <hch@lst.de>

tcp: add tcp_sock_set_user_timeout

Add a helper to directly set the TCP_USER_TIMEOUT sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 12abc5ee 27-May-2020 Christoph Hellwig <hch@lst.de>

tcp: add tcp_sock_set_nodelay

Add a helper to directly set the TCP_NODELAY sockopt from kernel space
without going through a fake uaccess. Cleanup the callers to avoid
pointless wrappers now that this is a simple function call.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4ceb229f 01-Apr-2020 Matthew Wilcox (Oracle) <willy@infradead.org>

ocfs2: use memalloc_nofs_save instead of memalloc_noio_save

OCFS2 doesn't mind if memory reclaim makes I/Os happen; it just cares that
it won't be reentered, so it can use memalloc_nofs_save() instead of
memalloc_noio_save().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200326200214.1102-1-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 1a5692e4 01-Apr-2020 Alex Shi <alex.shi@linux.alibaba.com>

ocfs2: remove useless err

We don't need 'err' in these 2 places, better to remove them.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: ChenGang <cg.chen@huawei.com>
Cc: Richard Fontana <rfontana@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1579577836-251879-1-git-send-email-alex.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e581595e 11-Jul-2019 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ocfs: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

Also, because there is no need to save the file dentry, remove all of
the variables that were being saved, and just recursively delete the
whole directory when shutting down, saving a lot of logic and local
variables.

[gregkh@linuxfoundation.org: v2]
Link: http://lkml.kernel.org/r/20190613055455.GE19717@kroah.com
Link: http://lkml.kernel.org/r/20190612152912.GA19151@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jia Guo <guojia12@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e926d8a1 11-Jul-2019 ChenGang <cg.chen@huawei.com>

fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat"

There are some spelling mistakes in ocfs, fix it.

Link: http://lkml.kernel.org/r/1558964623-106628-1-git-send-email-cg.chen@huawei.com
Signed-off-by: ChenGang <cg.chen@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 328970de 23-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 145

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version this program is distributed in the
hope that it will be useful but without any warranty without even
the implied warranty of merchantability or fitness for a particular
purpose see the gnu general public license for more details you
should have received a copy of the gnu general public license along
with this program if not write to the free software foundation inc
59 temple place suite 330 boston ma 021110 1307 usa

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 84 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190524100844.756442981@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# aa563d7b 19-Oct-2018 David Howells <dhowells@redhat.com>

iov_iter: Separate type from direction and use accessor functions

In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements. This makes it easier to add further
iterator types. Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.

Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself. Only the direction is required.

Signed-off-by: David Howells <dhowells@redhat.com>


# 229ba1f8 17-Aug-2018 wangyan <wangyan122@huawei.com>

ocfs2: clean up some unnecessary code

Several functions have some unnecessary code, clean up these code.

Link: http://lkml.kernel.org/r/5B14DF72.5020800@huawei.com
Signed-off-by: Yan Wang <wangyan122@huawei.com>
Reviewed-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6da2ec56 12-Jun-2018 Kees Cook <keescook@chromium.org>

treewide: kmalloc() -> kmalloc_array()

The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

kmalloc(a * b, gfp)

with:
kmalloc_array(a * b, gfp)

as well as handling cases of:

kmalloc(a * b * c, gfp)

with:

kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>


# 9b2c45d4 12-Feb-2018 Denys Vlasenko <dvlasenk@redhat.com>

net: make getname() functions return length rather than use int* parameter

Changes since v1:
Added changes in these files:
drivers/infiniband/hw/usnic/usnic_transport.c
drivers/staging/lustre/lnet/lnet/lib-socket.c
drivers/target/iscsi/iscsi_target_login.c
drivers/vhost/net.c
fs/dlm/lowcomms.c
fs/ocfs2/cluster/tcp.c
security/tomoyo/network.c

Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.

"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.

None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.

This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.

Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.

Userspace API is not changed.

text data bss dec hex filename
30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
30108109 2633612 873672 33615393 200ee21 vmlinux.o

Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
CC: David S. Miller <davem@davemloft.net>
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>


# e749d4fa 20-Sep-2017 Al Viro <viro@zeniv.linux.org.uk>

cfs2: switch to sock_recvmsg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# e99e88a9 16-Oct-2017 Kees Cook <keescook@chromium.org>

treewide: setup_timer() -> timer_setup()

This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

void my_callback(struct something *ptr)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

void my_callback(unsigned long data)
{
struct something *ptr = (struct something *)data;
...
}
...
ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

void my_callback(struct timer_list *t)
{
struct something *ptr = from_timer(ptr, t, my_timer);
...
}
...
ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

void my_callback(unsigned long data)
{
...
}
...
setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

void my_callback(struct timer_list *unused)
{
...
}
...
timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
-I ./arch/x86/include -I ./arch/x86/include/generated \
-I ./include -I ./arch/x86/include/uapi \
-I ./arch/x86/include/generated/uapi -I ./include/uapi \
-I ./include/generated/uapi --include ./include/linux/kconfig.h \
--dir . \
--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

setup_timer(
-&(e)
+&e
, ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
_E->_timer@_stl.function = _callback;
|
_E->_timer@_stl.function = &_callback;
|
_E->_timer@_stl.function = (_cast_func)_callback;
|
_E->_timer@_stl.function = (_cast_func)&_callback;
|
_E._timer@_stl.function = _callback;
|
_E._timer@_stl.function = &_callback;
|
_E._timer@_stl.function = (_cast_func)_callback;
|
_E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
(
... when != _origarg
_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
|
... when != _origarg
_handletype *_handle;
... when != _handle
_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
... when != _origarg
)
}

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
depends on change_timer_function_usage &&
!change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

void _callback(
-_origtype _origarg
+struct timer_list *t
)
{
+ _handletype *_origarg = from_timer(_origarg, t, _timer);
+
... when != _origarg
- (_handletype *)_origarg
+ _origarg
... when != _origarg
}

// Avoid already converted callbacks.
@match_callback_converted
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

void _callback(struct timer_list *t)
{ ... }

// callback(struct something *handle)
@change_callback_handle_arg
depends on change_timer_function_usage &&
!match_callback_converted &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

void _callback(
-_handletype *_handle
+struct timer_list *t
)
{
+ _handletype *_handle = from_timer(_handle, t, _timer);
...
}

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
depends on change_timer_function_usage &&
change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

void _callback(struct timer_list *t)
{
- _handletype *_handle = from_timer(_handle, t, _timer);
}

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
depends on change_timer_function_usage &&
!change_callback_handle_cast &&
!change_callback_handle_cast_no_arg &&
!change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
_E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
;
|
_E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
depends on change_timer_function_usage &&
(change_callback_handle_cast ||
change_callback_handle_cast_no_arg ||
change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

_callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
)

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

void _callback(
-_origtype _origarg
+struct timer_list *unused
)
{
... when != _origarg
}

Signed-off-by: Kees Cook <keescook@chromium.org>


# d47736fa 03-May-2017 Geliang Tang <geliangtang@gmail.com>

fs/ocfs2/cluster: use offset_in_page() macro

Use offset_in_page() macro instead of open-coding.

Link: http://lkml.kernel.org/r/4dbc77ccaaed98b183cf4dba58a4fa325fd65048.1492758503.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 667b8a37 03-May-2017 Geliang Tang <geliangtang@gmail.com>

fs/ocfs2/cluster: use setup_timer

Use setup_timer() instead of init_timer() to simplify the code.

Link: http://lkml.kernel.org/r/5e75bf07beb91e092d5aa36c36769949a480456a.1489060564.git.geliangtang@gmail.com
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e73a67f7 18-Mar-2017 Al Viro <viro@zeniv.linux.org.uk>

don't open-code kernel_setsockopt()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# cdfbabfb 09-Mar-2017 David Howells <dhowells@redhat.com>

net: Work around lockdep limitation in sockets that use sockets

Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

The theory lockdep comes up with is as follows:

(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:

mmap_sem must be taken before sk_lock-AF_RXRPC

(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:

sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:

sk_lock-AF_INET must be taken before mmap_sem

However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.

Fix the general case by:

(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.

(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.

Note that the child created by sk_clone_lock() inherits the parent's
kern setting.

(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().

Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.

Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:

irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()

because they follow a sock_create_kern() and accept off of that.

Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5b3cc15a 02-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare to move the memalloc_noio_*() APIs to <linux/sched/mm.h>

Update the .c files that depend on these APIs.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 2c935bc5 14-Nov-2016 Peter Zijlstra <peterz@infradead.org>

locking/atomic, kref: Add kref_read()

Since we need to change the implementation, stop exposing internals.

Provide kref_read() to read the current reference count; typically
used for debug messages.

Kills two anti-patterns:

atomic_read(&kref->refcount)
kref->refcount.counter

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>


# 7c0f6ba6 24-Dec-2016 Linus Torvalds <torvalds@linux-foundation.org>

Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# bf940776 07-Oct-2016 Bhaktipriya Shridhar <bhaktipriya96@gmail.com>

fs/ocfs2/cluster: remove deprecated create_singlethread_workqueue()

The workqueue "o2net_wq" queues multiple work items viz
&old_sc->sc_shutdown_work, &sc->sc_rx_work, &sc->sc_connect_work which
require strict execution ordering. Hence, an ordered dedicated
workqueue has been used.

WQ_MEM_RECLAIM has been set to ensure forward progress under memory
pressure.

Link: http://lkml.kernel.org/r/ddc12e5766c79ba26f8a00d98049107f8a1d4866.1472590094.git.bhaktipriya96@gmail.com
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7d65b274 26-Jul-2016 piaojun <piaojun@huawei.com>

ocfs2/cluster: clean up unnecessary assignment for 'ret'

Clean up unnecessary assignment for 'ret'.

Link: http://lkml.kernel.org/r/578C61F6.4080403@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e1daca28 17-May-2016 Eric Dumazet <edumazet@google.com>

ocfs2/cluster: block BH in TCP callbacks

TCP stack can now run from process context.

Use read_lock_bh() variant to restore previous assumption.

Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Fixes: d41a69f1d390 ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# fce56d84 24-Jun-2015 alex chen <alex.chen@huawei.com>

ocfs2: o2net: should remove debugfs in o2net_init() out branch

Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# e6d9f86d 10-Feb-2015 Dan Carpenter <dan.carpenter@oracle.com>

ocfs2: o2net: silence uninitialized variable warning

Smatch complains that, if o2net_tx_can_proceed() returns false, then "sc"
and "ret" are uninialized or maybe we are re-using the data from previous
iteration. I do not know if we can hit this bug in real life but checking
the return value is harmless and we may as well silence the static checker
warning.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 196fe71d 10-Dec-2014 Junxiao Bi <junxiao.bi@oracle.com>

ocfs2: o2net: fix connect expired

Set nn_persistent_error to -ENOTCONN will stop reconnect since the
"stop" condition in o2net_start_connect() will be true.

stop = (nn->nn_sc ||
(nn->nn_persistent_error &&
(nn->nn_persistent_error != -ENOTCONN || timeout == 0)));

This will make connection never be established if the first connection
request is lost.

Set nn_persistent_error to 0 when connect expired to fix this. With
this changes, dlm will not be waken up when connect expired, this is OK
since dlm depends on network, dlm can do nothing in this case if waken
up. Let it wait there for network recover and connect built again to
continue.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 7e8631e8 05-Nov-2014 Al Viro <viro@zeniv.linux.org.uk>

fix breakage in o2net_send_tcp_msg()

uninitialized msghdr. Broken in "ocfs2: don't open-code kernel_recvmsg()"
by me ;-/

Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# b246d3d1 09-Oct-2014 Xue jiufei <xuejiufei@huawei.com>

ocfs2: fix a deadlock while o2net_wq doing direct memory reclaim

Fix a deadlock problem caused by direct memory reclaim in o2net_wq. The
situation is as follows:

1) Receive a connect message from another node, node queues a
work_struct o2net_listen_work.

2) o2net_wq processes this work and call the following functions:

o2net_wq
-> o2net_accept_one
-> sock_create_lite
-> sock_alloc()
-> kmem_cache_alloc with GFP_KERNEL
-> ____cache_alloc_node
->__alloc_pages_nodemask
-> do_try_to_free_pages
-> shrink_slab
-> evict
-> ocfs2_evict_inode
-> ocfs2_drop_lock
-> dlmunlock
-> o2net_send_message_vec

then o2net_wq wait for the unlock reply from master.

3) tcp layer received the reply, call o2net_data_ready() and queue
sc_rx_work, waiting o2net_wq to process this work.

4) o2net_wq is a single thread workqueue, it process the work one by
one. Right now it is still doing o2net_listen_work and cannot handle
sc_rx_work. so we deadlock.

Junxiao Bi's patch "mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set"
(http://ozlabs.org/~akpm/mmots/broken-out/mm-clear-__gfp_fs-when-pf_memalloc_noio-is-set.patch)
clears __GFP_FS in memalloc_noio_flags() besides __GFP_IO. We use
memalloc_noio_save() to set process flag PF_MEMALLOC_NOIO so that all
allocations done by this process are done as if GFP_NOIO was specified.
We are not reentering filesystem while doing memory reclaim.

Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 5046f18d 09-Oct-2014 Junxiao Bi <junxiao.bi@oracle.com>

ocfs2: don't fire quorum before connection established

Firing quorum before connection established can cause unexpected node to
reboot.

Assume there are 3 nodes in the cluster, Node 1, 2, 3. Node 2 and 3 have
wrong ip address of Node 1 in cluster.conf and global heartbeat is enabled
in the cluster. After the heatbeats are started on these three nodes,
Node 1 will reboot due to quorum fencing. It is similar case if Node 1's
networking is not ready when starting the global heartbeat.

The reboot is not friendly as customer is not fully ready for ocfs2 to
work. Fix it by not allowing firing quorum before the connection is
established. In this case, ocfs2 will wait until the wrong configuration
is fixed or networking is up to continue. Also update the log to guide
the user where to check when connection is not built for a long time.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 4a4e07c1 09-Oct-2014 Joseph Qi <joseph.qi@huawei.com>

ocfs2: call o2quo_exit() if malloc failed in o2net_init()

In o2net_init, if malloc failed, it directly returns -ENOMEM. Then
o2quo_exit won't be called in init_o2nm.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8e9801df 29-Aug-2014 Junxiao Bi <junxiao.bi@oracle.com>

ocfs2: o2net: set tcp user timeout to max value

When tcp retransmit timeout(15mins), the connection will be closed.
Pending messages may be lost during this time. So we set tcp user
timeout to override the retransmit timeout to the max value. This is OK
for ocfs2 since we have disk heartbeat, if peer crash, the disk
heartbeat will timeout and it will be evicted, if disk heartbeat not
timeout and connection idle for a long time, then this means the cluster
enters split-brain state, since fence can't happen, we'd better keep the
connection and wait network recover.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# c43c363d 29-Aug-2014 Junxiao Bi <junxiao.bi@oracle.com>

ocfs2: o2net: don't shutdown connection when idle timeout

This patch series is to fix a possible message lost bug in ocfs2 when
network go bad. This bug will cause ocfs2 hung forever even network
become good again.

The messages may lost in this case. After the tcp connection is
established between two nodes, an idle timer will be set to check its
state periodically, if no messages are received during this time, idle
timer will timeout, it will shutdown the connection and try to
reconnect, so pending messages in tcp queues will be lost. This
messages may be from dlm. Dlm may get hung in this case. This may
cause the whole ocfs2 cluster hung.

This is very possible to happen when network state goes bad. Do the
reconnect is useless, it will fail if network state is still bad. Just
waiting there for network recovering may be a good idea, it will not
lost messages and some node will be fenced until cluster goes into
split-brain state, for this case, Tcp user timeout is used to override
the tcp retransmit timeout. It will timeout after 25 days, user should
have notice this through the provided log and fix the network, if they
don't, ocfs2 will fall back to original reconnect way.

This patch (of 3):

Some messages in the tcp queue maybe lost if we shutdown the connection
and reconnect when idle timeout. If packets lost and reconnect success,
then the ocfs2 cluster maybe hung.

To fix this, we can leave the connection there and do the fence decision
when idle timeout, if network recover before fence dicision is made, the
connection survive without lost any messages.

This bug can be saw when network state go bad. It may cause ocfs2 hung
forever if some packets lost. With this fix, ocfs2 will recover from
hung if network becomes good again.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 79deb3c1 10-Jun-2014 Tariq Saeed <tariq.x.saeed@oracle.com>

ocfs2/o2net: incorrect to terminate accepting connections loop upon rejecting an invalid one

When o2net-accept-one() rejects an illegal connection, it terminates the
loop picking up the remaining queued connections. This fix will
continue accepting connections till the queue is emtpy.

Addresses Orabug 17489469.

Signed-off-by: Tariq Saseed <tariq.x.saeed@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 1a5c4e2a 04-Jun-2014 Fabian Frederick <fabf@skynet.be>

ocfs2: remove NULL assignments on static

Static values are automatically initialized to NULL.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 676d2369 11-Apr-2014 David S. Miller <davem@davemloft.net>

net: Fix use after free by removing length arg from sk_data_ready callbacks.

Several spots in the kernel perform a sequence like:

skb_queue_tail(&sk->s_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);

But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access is potentially
to freed up memory.

Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.

And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.

So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.

Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>


# da8ded40 03-Apr-2014 Tariq Saeed <tariq.x.saeed@oracle.com>

ocfs2/o2net: o2net_listen_data_ready should do nothing if socket state is not TCP_LISTEN

Orabug: 17330860

When accepting an incomming connection o2net_accept_one clones a child
data socket from the parent listening socket. It then proceeds to setup
the child with callback o2net_data_ready() and sk_user_data to NULL. If
data arrives in this window, o2net_listen_data_ready will be called with
some non-deterministic value in sk_user_data (not inherited). We panic
when we page fault on sk_user_data -- in parent it is
sock_def_readable().

The fix is to recognize that this is a data socket being set up by
looking at the socket state and do nothing.

Signed-off-by: Tariq Saseed <tariq.x.saeed@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 41b63efb 03-Apr-2014 Joseph Qi <joseph.qi@huawei.com>

ocfs2: fix type conversion risk when get cluster attributes

In o2nm_cluster, cl_idle_timeout_ms, cl_keepalive_delay_ms, as well as
cl_reconnect_delay_ms, are defined as type of unsigned int. So we
should also use unsigned int in the helper functions.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 920220c1 08-Feb-2014 Al Viro <viro@zeniv.linux.org.uk>

ocfs2: don't open-code kernel_recvmsg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 66f5dcef 08-Feb-2014 Al Viro <viro@zeniv.linux.org.uk>

ocfs2: don't open-code kernel_sendmsg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>


# 03dbe88a 11-Sep-2013 Joseph Qi <joseph.qi@huawei.com>

ocfs2: avoid possible NULL pointer dereference in o2net_accept_one()

Since o2nm_get_node_by_num() may return NULL, we add this check in
o2net_accept_one() to avoid possible NULL pointer dereference.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 9a239e4c 11-Sep-2013 Joseph Qi <joseph.qi@huawei.com>

ocfs2: adjust code style for o2net_handler_tree_lookup()

Code in o2net_handler_tree_lookup() may be corrupted by mistake. So
adjust it to promote readability.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8dd7903e 11-Sep-2013 Sunil Mushran <sunil.mushran@oracle.com>

fs/ocfs2/cluster/tcp.c: fix possible null pointer dereferences

Fix some possible null pointer dereferences that were detected by the
static code analyser, smatch.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Reported-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 44e89cb8 03-Jul-2013 Jie Liu <jeff.liu@oracle.com>

ocfs2: adjust switch_case syntax at o2net_state_change()

Adjust switch..case syntax at o2net_state_change to meet the kernel coding
standard.

s/printk/pr_info/.

[akpm@linux-foundation.org: revert pr_foo() change]
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Cc: Gurudas Pai <gurudas.pai@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
Cc: Srinivas Eeeda <srinivas.eeda@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Tao Ma <tm@tao.ma>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d3e3b41b 03-Jul-2013 Younger Liu <younger.liu@huawei.com>

fs/ocfs2/cluster/tcp.c: free sc->sc_page in sc_kref_release()

There is a memory leak in sc_kref_release(). When free struct
o2net_sock_container (sc), we should release sc->sc_page.

Signed-off-by: Younger Liu <younger.liu@huawei.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6b207ba3 27-Feb-2013 Tejun Heo <tj@kernel.org>

ocfs2: convert to idr_alloc()

Convert to the much saner new idr interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# d787ab09 21-Feb-2013 Tim Gardner <tim.gardner@canonical.com>

ocfs2: remove kfree() redundant null checks

smatch analysis indicates a number of redundant NULL checks before
calling kfree(), eg:

fs/ocfs2/alloc.c:6138 ocfs2_begin_truncate_log_recovery() info:
redundant null check on *tl_copy calling kfree()

fs/ocfs2/alloc.c:6755 ocfs2_zero_range_for_truncate() info:
redundant null check on pages calling kfree()

etc....

[akpm@linux-foundation.org: revert dubious change in ocfs2_begin_truncate_log_recovery()]
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 8a168ca7 28-Dec-2012 Masanari Iida <standby24x7@gmail.com>

treewide: Fix typo in various drivers

Correct spelling typo in printk within various drivers.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>


# 4a17fd52 18-Apr-2012 Pavel Emelyanov <xemul@parallels.com>

sock: Introduce named constants for sk_reuse

Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>


# afeacc8c 26-May-2011 Paul Gortmaker <paul.gortmaker@windriver.com>

fs: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros

These files were getting <linux/module.h> via an implicit include
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason. Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>


# 3ba169cc 24-Jul-2011 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Add new function o2net_fill_node_map()

Patch adds function o2net_fill_node_map() to return the bitmap of nodes that
it is connected to. This bitmap is also accessible by the user via the debugfs
file, /sys/kernel/debug/o2net/connected_nodes.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>


# 1dfecf81 24-Jul-2011 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Clean up messages in o2net

o2net messages needed a facelift.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>


# 25985edc 30-Mar-2011 Lucas De Marchi <lucas.demarchi@profusion.mobi>

Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>


# ed59992e 18-Mar-2011 Rakib Mullick <rakib.mullick@gmail.com>

fs,ocfs2: Move o2net_get_func_run_time under CONFIG_OCFS2_FS_STATS.

When CONFIG_DEBUG_FS=y and CONFIG_OCFS2_FS_STATS=n, we get the
following warning:

fs/ocfs2/cluster/tcp.c:213:16: warning: ‘o2net_get_func_run_time’
defined but not used

Since o2net_get_func_run_time is only called from
o2net_update_recv_stats, so move it under CONFIG_OCFS2_FS_STATS.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: jlbec <jlbec@evilplan.org>


# e70d8450 27-Dec-2010 Randy Dunlap <randy.dunlap@oracle.com>

ocfs2: fix build for OCFS2_FS_STATS not enabled

When CONFIG_OCFS2_FS_STATS is not enabled:

fs/ocfs2/cluster/tcp.c:1254: error: implicit declaration of function 'o2net_update_recv_stats'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: ocfs2-devel@oss.oracle.com
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# e453039f 22-Dec-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Track process message timing stats for each socket

Tracks total time taken to process messages received on a socket.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 3c193b38 22-Dec-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Track send message timing stats for each socket

Tracks total send and status times for all messages sent on a socket.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# ff1becbf 22-Dec-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Use ktime instead of timeval in struct o2net_sock_container

Replace time trackers in struct o2net_sock_container from struct timeval to
union ktime.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 3f9c14fa 22-Dec-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Replace timeval with ktime in struct o2net_send_tracking

Replace time trackers in struct o2net_send_tracking from struct timeval to
union ktime.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 2b190ce9 14-Dec-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Pin the remote node item in configfs

o2net pins the node item of the remote node in configfs before initiating
the connection. It is unpinned on disconnect. This is to prevent the node
item from being unlinked while it is still in use.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 0e105d37 07-Oct-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Check slots for unconfigured live nodes

o2hb currently checks slots for configured nodes only. This patch makes
it check the slots for the live nodes too to take care of a race in which
a node is removed from the configuration but not from the live map.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>


# 50aff040 21-Aug-2010 Wu Fengguang <fengguang.wu@intel.com>

ocfs2/net: fix uninitialized ret in o2net_send_message_vec()

mmotm/fs/ocfs2/cluster/tcp.c: In function ‘o2net_send_message_vec’:
mmotm/fs/ocfs2/cluster/tcp.c:980:6: warning: ‘ret’ may be used uninitialized in this function

It seems a real bug introduced by commit 9af0b38ff3 (ocfs2/net:
Use wait_event() in o2net_send_message_vec()).

cc: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 415cf32c 01-Aug-2010 Tristan Ye <tristan.ye@oracle.com>

O2net: Disallow o2net accept connection request from itself.

Currently, o2net_accept_one() is allowed to accept a connection from
listening node itself, such a fake connection will not be successfully
established due to no handshake detected afterwards, and later end up
with triggering connecting worker in a loop.

We're going to fix this by treating such connection request as 'invalid',
since we've got no chance of requesting connection from a node to itself
in a OCFS2 cluster.

The fix doesn't hurt user's scan for o2net-listener, it always gets a
successful connection from userpace.

Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 23fd9abd 31-Mar-2010 Srinivas Eeda <srinivas.eeda@oracle.com>

o2net: log socket state changes

This patch logs socket state changes that lead to socket shutdown.

Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 03affdef 10-Mar-2010 Joe Perches <joe@perches.com>

fs/ocfs2/cluster/tcp.c: remove use of NIPQUAD, use %pI4

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 6efd8066 05-Feb-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/cluster: Make o2net connect messages KERN_NOTICE

Connect and disconnect messages are more than informational as they are required
during root cause analysis for failures. This patch changes them from KERN_INFO
to KERN_NOTICE.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Acked-by: Mark Faseh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 2bd63216 25-Jan-2010 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/trivial: Remove trailing whitespaces

Patch removes trailing whitespaces.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# 9af0b38f 11-Jun-2009 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/net: Use wait_event() in o2net_send_message_vec()

Replace wait_event_interruptible() with wait_event() in o2net_send_message_vec().
This is because this function is called by the dlm that expects signals to be
blocked.

Fixes oss bugzilla#1126
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1126

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>


# be859405 31-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com>

fs: replace NIPQUAD()

Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 18496e80 06-Aug-2008 Adrian Bunk <bunk@kernel.org>

[PATCH] ocfs2/cluster/tcp.c: make some functions static

Commit 0f475b2abed6cbccee1da20a0bef2895eb2a0edd (ocfs2/net: Silence build
warnings) made sense as far as it fixed compile warnings, but it was not
required that it made the functions global.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>


# 0f475b2a 12-May-2008 Sunil Mushran <sunil.mushran@oracle.com>

[PATCH 3/3] ocfs2/net: Silence build warnings

This patch silences the build warnings concerning o2net_init_nst()
and friends when building without CONFIG_DEBUG_FS enabled.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>


# 2309e9e0 14-Apr-2008 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2/net: Add debug interface to o2net

This patch exposes o2net information via debugfs. The information includes
the list of sockets (sock_containers) as well as the list of outstanding
messages (send_tracking). Useful for o2dlm debugging.

(This patch is derived from an earlier one written by Zach Brown that
exposed the same information via /proc.)

[Mark: checkpatch fixes]

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>


# 409753bf 28-Mar-2008 Jeff Mahoney <jeffm@suse.com>

ocfs2/cluster: Get rid of arguments to the timeout routines

We keep seeing bug reports related to NULL pointer derefs in
o2net_set_nn_state(). When I originally wrote up the configurable timeout
patch, I had tried to plan for multiple clusters. This was silly.

The timeout routines all use o2nm_single_cluster so there's no point in
passing an argument at all. This patch removes the arguments and kills those
bugs dead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>


# 5cc3bf27 05-Mar-2008 Tao Ma <tao.ma@oracle.com>

ocfs2: Reconnect after idle time out.

Currently, o2net connects to a node on hb_up and disconnects on
hb_down and net timeout.

It disconnects on net timeout is ok, but it should attempt to
reconnect back. This is because sometimes nodes get overloaded
enough that the network connection breaks but the disk hb does not.
And if we get into that situation, we either fence (unnecessarily)
or wait for its disk hb to die (and sometimes hang in the process).

So in this updated scheme, when the network disconnects, we keep
attempting to reconnect till we succeed or we get a disk hb down
event.

If the other node is really dead, then we will eventually get a
node down event. If not, we should be able to connect again and
continue.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>


# cdef59a9 05-Mar-2008 Tao Ma <tao.ma@oracle.com>

ocfs2: Fix NULL pointer dereferences in o2net

In some situations, ocfs2_set_nn_state might get called with sc = NULL and
valid = 0. If sc = NULL, we can't dereference it to get the o2nm_node
member. Instead, do what o2net_initialize_handshake does and use NULL when
calling o2net_reconnect_delay and o2net_idle_timeout.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 97bd7919 05-Dec-2007 Al Viro <viro@ftp.linux.org.uk>

remove nonsense force-casts from ocfs2

endianness annotations in networking code had been in place for quite a
while; in particular, sin_port and s_addr are annotated as big-endian.

Code in ocfs2 had __force casts added apparently to shut the sparse
warnings up; of course, these days they only serve to *produce* warnings
for no reason whatsoever...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>


# 91cf45f0 12-Nov-2007 Trond Myklebust <Trond.Myklebust@netapp.com>

[NET]: Add the helper kernel_sock_shutdown()

...and fix a couple of bugs in the NBD, CIFS and OCFS2 socket handlers.

Looking at the sock->op->shutdown() handlers, it looks as if all of them
take a SHUT_RD/SHUT_WR/SHUT_RDWR argument instead of the
RCV_SHUTDOWN/SEND_SHUTDOWN arguments.
Add a helper, and then define the SHUT_* enum to ensure that kernel users
of shutdown() don't get confused.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# ce17204a 30-Jul-2007 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2: Retry sendpage() if it returns EAGAIN

Instead of treating EAGAIN, returned from sendpage(), as an error, this
patch retries the operation.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 800deef3 17-May-2007 Christoph Hellwig <hch@lst.de>

[PATCH] ocfs2: use list_for_each_entry where benefical

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 14829422 14-Jun-2007 Joel Becker <joel.becker@oracle.com>

ocfs2: Depend on configfs heartbeat items.

ocfs2 mounts require a heartbeat region. Use the new configfs_depend_item()
facility to actually depend on them so they can't go away from under us.

First, teach cluster/nodemanager.c to depend an item on the o2cb subsystem.
Then teach o2hb_register_callbacks to take a UUID and depend on the
appropriate region. Finally, teach all users of o2hb to pass a UUID or
NULL if they don't require a pin.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 5fdf1e67 27-Apr-2007 Mark Fasheh <mark.fasheh@oracle.com>

ocfs2: fix sparse warnings in fs/ocfs2/cluster

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# c24f72cc 03-Feb-2007 Joel Becker <joel.becker@oracle.com>

ocfs2: Proper cleanup in case of error in ocfs2_register_hb_callbacks()

If ocfs2_register_hb_callbacks() succeeds on its first callback but fails
its second, it doesn't release the first on the way out. Fix that.

While we're at it, o2hb_unregister_callback() never returns anything but
0, so let's make it void.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 925037bc 23-Jan-2007 Zhen Wei <zwei@novell.com>

ocfs2: introduce sc->sc_send_lock to protect outbound outbound messages

When there is a lot of multithreaded I/O usage, two threads can collide
while sending out a message to the other nodes. This is due to the lack of
locking between threads while sending out the messages.

When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux
kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects
itself by acquiring a lock at invocation by calling lock_sock().
tcp_sendmsg() then loops over the buffers in the iovec, allocating
associated sk_buff's and cache pages for use in the actual send. As it does
so, it pushes the data out to tcp for actual transmission. However, if one
of those allocation fails (because a large number of large sends is being
processed, for example), it must wait for memory to become available. It
does so by jumping to wait_for_sndbuf or wait_for_memory, both of which
eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory()
contains a code path that calls sk_wait_event(). Finally, sk_wait_event()
contains the call to release_sock().

The following patch adds a lock to the socket container in order to
properly serialize outbound requests.

From: Zhen Wei <zwei@novell.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# ab81afd3 29-Jan-2007 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2: Binds listener to the configured ip address

This patch binds the o2net listener to the configured ip address
instead of INADDR_ANY for security. Fixes oss.oracle.com bugzilla#814.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# d74c9803 17-Jan-2007 Kurt Hackel <kurt.hackel@oracle.com>

ocfs2: Added post handler callable function in o2net message handler

Currently o2net allows one handler function per message type. This
patch adds the ability to call another function to be called after
the handler has returned the message to the other node.

Handlers are now given the option of returning a context (in the form of a
void **) which will be passed back into the post message handler function.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# cd861280 13-Dec-2006 Robert P. J. Day <rpjday@mindspring.com>

[PATCH] Fix numerous kcalloc() calls, convert to kzalloc()

All kcalloc() calls of the form "kcalloc(1,...)" are converted to the
equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect
ordering of the first two arguments are fixed.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Adam Belay <ambx1@neo.rr.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Greg KH <greg@kroah.com>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# 828ae6af 04-Dec-2006 Andrew Beekhof <abeekhof@suse.de>

[patch 3/3] OCFS2 Configurable timeouts - Protocol changes

Modify the OCFS2 handshake to ensure essential timeouts are configured
identically on all nodes.

Only allow changes when there are no connected peers

Improves the logic in o2net_advance_rx() which broke now that
sizeof(struct o2net_handshake) is greater than sizeof(struct o2net_msg)

Included is the field for userspace-heartbeat timeout to avoid the need for
further protocol changes.

Uses a global spinlock to ensure the decisions to update configfs entries
are made on the correct value. The region covered by the spinlock when
incrementing the counter is much larger as this is the more critical case.

Small cleanup contributed by Adrian Bunk <bunk@stusta.de>

Signed-off-by: Andrew Beekhof <abeekhof@suse.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# b5dd8030 04-Dec-2006 Jeff Mahoney <jeffm@suse.de>

[patch 2/3] OCFS2 Configurable timeouts

Allow configuration of OCFS2 timeouts from userspace via configfs

Signed-off-by: Andrew Beekhof <abeekhof@suse.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# c4028958 22-Nov-2006 David Howells <dhowells@redhat.com>

WorkStruct: make allyesconfig

Fix up for make allyesconfig.

Signed-Off-By: David Howells <dhowells@redhat.com>


# 781ee3e2 27-Apr-2006 Sunil Mushran <sunil.mushran@oracle.com>

ocfs2: Cleanup message prints

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 34af946a 27-Jun-2006 Ingo Molnar <mingo@elte.hu>

[PATCH] spin/rwlock init cleanups

locking init cleanups:

- convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
- convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

- cleanliness
- lockdep needs control of lock initialization, which the open-coded
variants do not give
- it's also useful for -rt and for lock debugging in general

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>


# b7668c72 01-Mar-2006 Sunil Mushran <sunil.mushran@oracle.com>

[PATCH] ocfs2: added source addr to bind() in o2net_start_connect()

to prevent confusion when a virtual ip is created on the same interface

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 215c7f9f 01-Feb-2006 Mark Fasheh <mark.fasheh@oracle.com>

[PATCH] ocfs2: fix compile warnings

Fix a couple of compile warnings found when compiling on a ppc64 build box.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>


# 98211489 15-Dec-2005 Zach Brown <zach.brown@oracle.com>

[PATCH] OCFS2: The Second Oracle Cluster Filesystem

Node messaging via tcp. Used by the dlm and the file system for point
to point communication between nodes.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>