History log of /linux-master/net/dccp/diag.c
Revision Date Author Comments
# db591469 22-Jan-2024 Eric Dumazet <edumazet@google.com>

inet_diag: add module pointer to "struct inet_diag_handler"

Following patch is going to use RCU instead of
inet_diag_table_mutex acquisition.

This patch is a preparation, no change of behavior yet.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 0df6d328 25-Feb-2020 Martin KaFai Lau <kafai@fb.com>

inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data

The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when
the "dump()" is re-started.

In a latter patch, it will also need to parse the new
INET_DIAG_REQ_SK_BPF_STORAGES nlattr to learn the map_fds. Thus, this
patch takes this chance to store the parsed nlattr in cb->data
during the "start" time of a dump.

By doing this, the "bc" argument also becomes unnecessary
and is removed. Also, the two copies of the INET_DIAG_REQ_BYTECODE
parsing-audit logic between compat/current version can be
consolidated to one.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200225230415.1975555-1-kafai@fb.com


# 5682d393 25-Feb-2020 Martin KaFai Lau <kafai@fb.com>

inet_diag: Refactor inet_sk_diag_fill(), dump(), and dump_one()

In a latter patch, there is a need to update "cb->min_dump_alloc"
in inet_sk_diag_fill() as it learns the diffierent bpf_sk_storages
stored in a sk while dumping all sk(s) (e.g. tcp_hashinfo).

The inet_sk_diag_fill() currently does not take the "cb" as an argument.
One of the reason is inet_sk_diag_fill() is used by both dump_one()
and dump() (which belong to the "struct inet_diag_handler". The dump_one()
interface does not pass the "cb" along.

This patch is to make dump_one() pass a "cb". The "cb" is created in
inet_diag_cmd_exact(). The "nlh" and "in_skb" are stored in "cb" as
the dump() interface does. The total number of args in
inet_sk_diag_fill() is also cut from 10 to 7 and
that helps many callers to pass fewer args.

In particular,
"struct user_namespace *user_ns", "u32 pid", and "u32 seq"
can be replaced by accessing "cb->nlh" and "cb->skb".

A similar argument reduction is also made to
inet_twsk_diag_fill() and inet_req_diag_fill().

inet_csk_diag_dump() and inet_csk_diag_fill() are also removed.
They are mostly equivalent to inet_sk_diag_fill(). Their repeated
usages are very limited. Thus, inet_sk_diag_fill() is directly used
in those occasions.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200225230409.1975173-1-kafai@fb.com


# d2912cb1 04-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500

Based on 2 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 3fd22af8 15-Jun-2015 Craig Gallek <kraig@google.com>

sock_diag: specify info_size per inet protocol

Previously, there was no clear distinction between the inet protocols
that used struct tcp_info to report information and those that didn't.
This change adds a specific size attribute to the inet_diag_handler
struct which defines these interfaces. This will make dispatching
sock_diag get_info requests identical for all inet protocols in a
following patch.

Tested: ss -au
Tested: ss -at
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 34160ea3 10-Mar-2015 Eric Dumazet <edumazet@google.com>

inet_diag: add const to inet_diag_req_v2

diag dumpers should not modify the request.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# c8991362 10-Jan-2012 Pavel Emelyanov <xemul@parallels.com>

inet_diag: Rename inet_diag_req into inet_diag_req_v2

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# aec8dc62 14-Dec-2011 Pavel Emelyanov <xemul@parallels.com>

sock_diag: Fix module netlink aliases

I've made a mistake when fixing the sock_/inet_diag aliases :(

1. The sock_diag layer should request the family-based alias,
not just the IPPROTO_IP one;
2. The inet_diag layer should request for AF_INET+protocol alias,
not just the protocol one.

Thus fix this.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1942c518 08-Dec-2011 Pavel Emelyanov <xemul@parallels.com>

inet_diag: Generalize inet_diag dump and get_exact calls

Introduce two callbacks in inet_diag_handler -- one for dumping all
sockets (with filters) and the other one for dumping a single sk.

Replace direct calls to icsk handlers with indirect calls to callbacks
provided by handlers.

Make existing TCP and DCCP handlers use provided helpers for icsk-s.

The UDP diag module will provide its own.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7b35eadd 08-Dec-2011 Pavel Emelyanov <xemul@parallels.com>

inet_diag: Remove indirect sizeof from inet diag handlers

There's an info_size value stored on inet_diag_handler, but for existing
code this value is effectively constant, so just use sizeof(struct tcp_info)
where required.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f13c95f0 06-Dec-2011 Pavel Emelyanov <xemul@parallels.com>

inet_diag: Switch from _GETSOCK to IPPROTO_ numbers

Sorry, but the vger didn't let this message go to the list. Re-sending it with
less spam-filter-prone subject.

When dumping the AF_INET/AF_INET6 sockets user will also specify the protocol,
so prepare the protocol diag handlers to work with IPPROTO_ constants.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7f1fb60c 06-Dec-2011 Pavel Emelyanov <xemul@parallels.com>

inet_diag: Partly rename inet_ to sock_

The ultimate goal is to get the sock_diag module, that works in
family+protocol terms. Currently this is suitable to do on the
inet_diag basis, so rename parts of the code. It will be moved
to sock_diag.c later.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a693722a 17-Dec-2008 Arnaldo Carvalho de Melo <acme@redhat.com>

dccp_diag: LISTEN sockets don't have CCIDs

And thus when we try to use 'ss -danemi' on these sockets that have no
ccid blocks (data collected using systemtap after I fixed the problem):

dccp_diag_get_info sk=0xffff8801220a3100, dp->dccps_hc_rx_ccid=0x0000000000000000, dp->dccps_hc_tx_ccid=0x0000000000000000

We get an OOPS:

mica.ghostprotocols.net login: BUG: unable to handle kernel NULL pointer
dereferenc0
IP: [<ffffffffa0136082>] dccp_diag_get_info+0x82/0xc0 [dccp_diag]
PGD 12106f067 PUD 122488067 PMD 0
Oops: 0000 [#1] PREEMPT

Fix is trivial, and 'ss -d' is working again:

[root@mica ~]# ss -danemi
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 0 *:5001 *:*
ino:7288 sk:220a3100ffff8801
mem:(r0,w0,f0,t0) cwnd:0 ssthresh:0
[root@mica ~]#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6fdd34d4 08-Dec-2008 Gerrit Renker <gerrit@erg.abdn.ac.uk>

dccp ccid-2: Phase out the use of boolean Ack Vector sysctl

This removes the use of the sysctl and the minisock variable for the Send Ack
Vector feature, as it now is handled fully dynamically via feature negotiation
(i.e. when CCID-2 is enabled, Ack Vectors are automatically enabled as per
RFC 4341, 4.).

Using a sysctl in parallel to this implementation would open the door to
crashes, since much of the code relies on tests of the boolean minisock /
sysctl variable. Thus, this patch replaces all tests of type

if (dccp_msk(sk)->dccpms_send_ack_vector)
/* ... */
with
if (dp->dccps_hc_rx_ackvec != NULL)
/* ... */

The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
negotiation concluded that Ack Vectors are to be used on the half-connection.
Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
so that the test is a valid one.

The activation handler for Ack Vectors is called as soon as the feature
negotiation has concluded at the
* server when the Ack marking the transition RESPOND => OPEN arrives;
* client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.

Adding the sequence number of the Response packet to the Ack Vector has been
removed, since
(a) connection establishment implies that the Response has been received;
(b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
this entry will always be ignored;
(c) it can not be used for anything useful - to detect loss for instance, only
packets received after the loss can serve as pseudo-dupacks.

There was a FIXME to change the error code when dccp_ackvec_add() fails.
I removed this after finding out that:
* the check whether ackno < ISN is already made earlier,
* this Response is likely the 1st packet with an Ackno that the client gets,
* so when dccp_ackvec_add() fails, the reason is likely not a packet error.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a7a0d6a8 19-Nov-2008 Eric Dumazet <dada1@cosmosbay.com>

net: inet_diag_handler structs can be const

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 410e27a4 09-Sep-2008 Gerrit Renker <gerrit@erg.abdn.ac.uk>

This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"
as it accentally contained the wrong set of patches. These will be
submitted separately.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>


# b235dc4a 03-Sep-2008 Gerrit Renker <gerrit@erg.abdn.ac.uk>

dccp ccid-2: Phase out the use of boolean Ack Vector sysctl

This removes the use of the sysctl and the minisock variable for the Send Ack
Vector feature, which is now handled fully dynamically via feature negotiation;
i.e. when CCID2 is enabled, Ack Vectors are automatically enabled (as per
RFC 4341, 4.).

Using a sysctl in parallel to this implementation would open the door to
crashes, since much of the code relies on tests of the boolean minisock /
sysctl variable. Thus, this patch replaces all tests of type

if (dccp_msk(sk)->dccpms_send_ack_vector)
/* ... */
with
if (dp->dccps_hc_rx_ackvec != NULL)
/* ... */

The dccps_hc_rx_ackvec is allocated by the dccp_hdlr_ackvec() when feature
negotiation concluded that Ack Vectors are to be used on the half-connection.
Otherwise, it is NULL (due to dccp_init_sock/dccp_create_openreq_child),
so that the test is a valid one.

The activation handler for Ack Vectors is called as soon as the feature
negotiation has concluded at the
* server when the Ack marking the transition RESPOND => OPEN arrives;
* client after it has sent its ACK, marking the transition REQUEST => PARTOPEN.

Adding the sequence number of the Response packet to the Ack Vector has been
removed, since
(a) connection establishment implies that the Response has been received;
(b) the CCIDs only look at packets received in the (PART)OPEN state, i.e.
this entry will always be ignored;
(c) it can not be used for anything useful - to detect loss for instance, only
packets received after the loss can serve as pseudo-dupacks.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>


# 305e1e96 21-Oct-2007 Jean Delvare <jdelvare@suse.de>

[INET]: Let inet_diag and friends autoload

By adding module aliases to inet_diag, tcp_diag and dccp_diag, we let
them load automatically as needed. This makes tools like "ss" run
faster.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6ab3d562 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de>

Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>


# a4bf3902 20-Mar-2006 Arnaldo Carvalho de Melo <acme@mandriva.com>

[DCCP] minisock: Rename struct dccp_options to struct dccp_minisock

This will later be included in struct dccp_request_sock so that we can
have per connection feature negotiation state while in the 3way
handshake, when we clone the DCCP_ROLE_LISTEN socket (in
dccp_create_openreq_child) we'll just copy this state from
dreq_minisock to dccps_minisock.

Also the feature negotiation and option parsing code will mostly touch
dccps_minisock, which will simplify some stuff.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d83d8461 14-Dec-2005 Arnaldo Carvalho de Melo <acme@mandriva.com>

[IP_SOCKGLUE]: Remove most of the tcp specific calls

As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 2babe1f6 23-Aug-2005 Arnaldo Carvalho de Melo <acme@mandriva.com>

[DCCP]: Introduce dccp_get_info

And also hc_tx and hc_rx get_info functions for the CCIDs to fill in
information that is specific to them.

For now reusing struct tcp_info, later I'll try to figure out a better
solution, for now its really nice to get this kind of info:

[root@qemu ~]# ./ss -danemi
State Recv-Q Send-Q Local Addr:Port Peer Addr:Port
LISTEN 0 0 *:5001 *:* ino:628 sk:c1340040
mem:(r0,w0,f0,t0) cwnd:0 ssthresh:0
ESTAB 0 0 172.20.0.2:5001 172.20.0.1:32785 ino:629 sk:c13409a0
mem:(r0,w0,f0,t0) ts rto:1000 rtt:0.004/0 cwnd:0 ssthresh:0 rcv_rtt:61.377

This, for instance, shows that we're not congestion controlling ACKs,
as the above output is in the ttcp receiving host, and ttcp is a one
way app, i.e. the received never calls sendmsg, so
ccid_hc_tx_send_packet is never called, so the TX half connection
stays in TFRC_SSTATE_NO_SENT state and hctx_rtt is never calculated,
stays with the value set in ccid3_hc_tx_init, 4us, as show above in
milliseconds (0.004ms), upcoming patches will fix this.

rcv_rtt seems sane tho, matching ping results :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# a8c2190e 11-Aug-2005 Arnaldo Carvalho de Melo <acme@mandriva.com>

[INET_DIAG]: Rename tcp_diag.[ch] to inet_diag.[ch]

Next changeset will introduce net/ipv4/tcp_diag.c, moving the code that was put
transitioanlly in inet_diag.c.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 73c1f4a0 11-Aug-2005 Arnaldo Carvalho de Melo <acme@mandriva.com>

[TCPDIAG]: Just rename everything to inet_diag

Next changeset will rename tcp_diag.[ch] to inet_diag.[ch].

I'm taking this longer route so as to easy review, making clear the changes
made all along the way.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4f5736c4 12-Aug-2005 Arnaldo Carvalho de Melo <acme@mandriva.com>

[TCPDIAG]: Introduce inet_diag_{register,unregister}

Next changeset will rename tcp_diag to inet_diag and move the tcp_diag code out
of it and into a new tcp_diag.c, similar to the net/dccp/diag.c introduced in
this changeset, completing the transition to a generic inet_diag
infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>