History log of /linux-master/net/mctp/route.c
Revision Date Author Comments
# 1394c1de 19-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: copy skb ext data when fragmenting

If we're fragmenting on local output, the original packet may contain
ext data for the MCTP flows. We'll want this in the resulting fragment
skbs too.

So, do a skb_ext_copy() in the fragmentation path, and implement the
MCTP-specific parts of an ext copy operation.

Fixes: 67737c457281 ("mctp: Pass flow data & flow release events to drivers")
Reported-by: Jian Zhang <zhangjian.3032@bytedance.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 43e67955 19-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: separate key correlation across nets

Currently, we look up sk_keys from the entire struct net_namespace, which
may contain multiple MCTP net IDs. In those cases we want to distinguish
between endpoints with the same EID but different net ID.

Add the net ID data to the struct mctp_sk_key, populate on add and
filter on this during route lookup.

For the ioctl interface, we use a default net of
MCTP_INITIAL_DEFAULT_NET (i.e., what will be in use for single-net
configurations), but we'll extend the ioctl interface to provide
net-specific tag allocation in an upcoming change.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
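
As a rough userspace illustration of the filtering described above (not the
kernel code itself), a key match that also compares the net ID could look like
the sketch below. The struct layout, the sketch_* names and the 0xff wildcard
value are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ADDR_ANY 0xff /* assumed wildcard EID value */

/* Hypothetical, simplified stand-in for struct mctp_sk_key. */
struct sketch_key {
	unsigned int net; /* MCTP net ID the key was created for */
	uint8_t local_addr;
	uint8_t peer_addr;
	uint8_t tag;
};

/* A key address matches if it is equal to the EID or is the wildcard. */
static bool sketch_addr_matches(uint8_t match, uint8_t eid)
{
	return match == eid || match == SKETCH_ADDR_ANY;
}

/* Keys from a different net never match, even if EIDs and tag are equal. */
static bool sketch_key_match(const struct sketch_key *key, unsigned int net,
			     uint8_t local, uint8_t peer, uint8_t tag)
{
	if (key->net != net)
		return false;
	if (!sketch_addr_matches(key->local_addr, local))
		return false;
	if (!sketch_addr_matches(key->peer_addr, peer))
		return false;
	return key->tag == tag;
}
```

With this shape, two endpoints that share an EID but sit on different nets
produce distinct keys, because the net comparison is done before any address
comparison.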


# fc944ecc 19-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: make key lookups match the ANY address on either local or peer

We may have an ANY address in either the local or peer address of a
sk_key, and may want to match on an incoming daddr or saddr being ANY.

Do this by altering the conflicting-tag lookup to also accept ANY as
the local/peer address.

We don't want mctp_address_matches to match on the requested EID being
ANY, as that is a specific lookup case on packet input.

Reported-by: Eric Chuang <echuang@google.com>
Reported-by: Anthony <anthonyhkf@google.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# aee6479a 19-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: Add some detail on the key allocation implementation

We could do with a little more comment on where MCTP_ADDR_ANY will match
in the key allocations.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# ee076b73 19-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: avoid confusion over local/peer dest/source addresses

We have a double-swap of local and peer addresses in
mctp_alloc_local_tag; the arguments in both call sites are swapped, but
there is also a swap in the implementation of alloc_local_tag. This is
opaque because we're using source/dest address references, which don't
match the local/peer semantics.

Avoid this confusion by naming the arguments as 'local' and 'peer', and
remove the double swap. The calling order now matches mctp_key_alloc.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>


# 3773d65a 20-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: take ownership of skb in mctp_local_output

Currently, mctp_local_output only takes ownership of skb on success, and
we may leak an skb if mctp_local_output fails in specific states; the
skb ownership isn't transferred until the actual output routing occurs.

Instead, make mctp_local_output free the skb on all error paths up to
the route action, so it always consumes the passed skb.

Fixes: 833ef3b91de6 ("mctp: Populate socket implementation")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220081053.1439104-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
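
The ownership rule described above ("always consumes the passed skb") can be
sketched in plain userspace C. Everything here is a stand-in: sketch_buf
replaces the skb, sketch_route_output replaces the route action, and the free
counter exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb. */
struct sketch_buf {
	size_t len;
};

static int sketch_freed; /* counts frees, for illustration only */

static void sketch_buf_free(struct sketch_buf *buf)
{
	free(buf);
	sketch_freed++;
}

/* Stand-in for the route output step, which takes ownership of buf. */
static int sketch_route_output(struct sketch_buf *buf)
{
	sketch_buf_free(buf);
	return 0;
}

/* Consumes buf on every path: early errors free it via out_free, and the
 * success path hands it to the route output step.
 */
static int sketch_local_output(struct sketch_buf *buf, int have_route)
{
	int rc;

	if (!have_route) {
		rc = -EHOSTUNREACH;
		goto out_free;
	}
	if (buf->len == 0) {
		rc = -EINVAL;
		goto out_free;
	}
	return sketch_route_output(buf);

out_free:
	sketch_buf_free(buf);
	return rc;
}
```

Callers of a function written this way must not touch or free the buffer
after the call, whatever the return value.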


# 9990889b 15-Feb-2024 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: put sock on tag allocation failure

We may hold an extra reference on a socket if a tag allocation fails: we
optimistically allocate the sk_key, and take a ref there, but do not
drop if we end up not using the allocated key.

Ensure we're dropping the sock on this failure by doing a proper unref
rather than directly kfree()ing.

Fixes: de8a6b15d965 ("net: mctp: add an explicit reference from a mctp_sk_key to sock")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ce9b61e44d1cdae7797be0c5e3141baf582d23a0.1707983487.git.jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 5093bbfc 09-Oct-2023 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: perform route lookups under a RCU read-side lock

Our current route lookups (mctp_route_lookup and mctp_route_lookup_null)
traverse the net's route list without the RCU read lock held. This means
the route lookup is subject to preemption, resulting in a potential
grace period expiry, and so an eventual kfree() while we still have the
route pointer.

Add the proper read-side critical section locks around the route
lookups, preventing preemption and a possible parallel kfree.

The remaining net->mctp.routes accesses are already under a
rcu_read_lock, or protected by the RTNL for updates.

Based on an analysis from Sili Luo <rootlab@huawei.com>, where
introducing a delay in the route lookup could cause a UAF on
simultaneous sendmsg() and route deletion.

Reported-by: Sili Luo <rootlab@huawei.com>
Fixes: 889b7da23abf ("mctp: Add initial routing framework")
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/29c4b0e67dc1bf3571df3982de87df90cae9b631.1696837310.git.jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# f60ce8a4 15-Jun-2023 Lin Ma <linma@zju.edu.cn>

net: mctp: remove redundant RTN_UNICAST check

The current mctp_newroute() contains two identical checks against
rtm->rtm_type:

    static int mctp_newroute(...)
    {
        ...
        if (rtm->rtm_type != RTN_UNICAST) { // (1)
            NL_SET_ERR_MSG(extack, "rtm_type must be RTN_UNICAST");
            return -EINVAL;
        }
        ...
        if (rtm->rtm_type != RTN_UNICAST) // (2)
            return -EINVAL;
        ...
    }

This commit removes check (2), as it is redundant.

Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20230615152240.1749428-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# b98e1a04 23-Jan-2023 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: mark socks as dead on unhash, prevent re-add

Once a socket has been unhashed, we want to prevent it from being
re-used in a sk_key entry as part of a routing operation.

This change marks the sk as SOCK_DEAD on unhash, which prevents addition
into the net's key list.

We need to do this during the key add path, rather than key lookup, as
we release the net keys_lock between those operations.

Fixes: 4a992bbd3650 ("mctp: Implement message fragmentation & reassembly")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6e54ea37 23-Jan-2023 Paolo Abeni <pabeni@redhat.com>

net: mctp: hold key reference when looking up a general key

Currently, we have a race where we look up a sock through a "general"
(ie, not directly associated with the (src,dest,tag) tuple) key, then
drop the key reference while still holding the key's sock.

This change expands the key reference until we've finished using the
sock, and hence the sock reference too.

Commit message changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Fixes: 73c618456dc5 ("mctp: locking, lifetime and validity changes for sk_keys")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# de8a6b15 23-Jan-2023 Jeremy Kerr <jk@codeconstruct.com.au>

net: mctp: add an explicit reference from a mctp_sk_key to sock

Currently, we correlate the mctp_sk_key lifetime to the sock lifetime
through the sock hash/unhash operations, but this is pretty tenuous, and
there are cases where we may have a temporary reference to an unhashed
sk.

This change makes the reference more explicit, by adding a hold on the
sock when it's associated with a mctp_sk_key, released on final key
unref.

Fixes: 73c618456dc5 ("mctp: locking, lifetime and validity changes for sk_keys")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
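
A minimal userspace sketch of the reference relationship described above: the
key takes a hold on the sock when the two are associated, and drops it on the
final key unref. The types and sketch_* functions are hypothetical and use
plain integer counters rather than the kernel's refcount primitives.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the sock and the mctp_sk_key. */
struct sketch_sock {
	int refs;
};

struct sketch_key {
	int refs;
	struct sketch_sock *sk;
};

static void sketch_sock_hold(struct sketch_sock *sk)
{
	sk->refs++;
}

static void sketch_sock_put(struct sketch_sock *sk)
{
	sk->refs--; /* a real implementation would free the sock at zero */
}

/* Allocating a key for a sock takes an explicit reference on that sock. */
static struct sketch_key *sketch_key_alloc(struct sketch_sock *sk)
{
	struct sketch_key *key = calloc(1, sizeof(*key));

	if (!key)
		return NULL;
	key->refs = 1;
	key->sk = sk;
	sketch_sock_hold(sk);
	return key;
}

/* The sock reference is released only on the final key unref. */
static void sketch_key_put(struct sketch_key *key)
{
	if (--key->refs > 0)
		return;
	sketch_sock_put(key->sk);
	free(key);
}
```

One consequence of this design is that an unused key has to be released
through the put path; freeing the key memory directly would leave the sock
reference count elevated.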


# d4072058 08-Nov-2022 Wei Yongjun <weiyongjun1@huawei.com>

mctp: Fix an error handling path in mctp_init()

If mctp_neigh_init() returns an error, the route resources should
be released in the error handling path; otherwise they are
leaked.

Fixes: 4d8b9319282a ("mctp: Add neighbour implementation")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Matt Johnston <matt@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20221108095517.620115-1-weiyongjun@huaweicloud.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 3a732b46 11-Oct-2022 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: prevent double key removal and unref

Currently, we have a bug where a simultaneous DROPTAG ioctl and socket
close may race, as we attempt to remove a key from lists twice, and
perform an unref for each removal operation. This may result in a UAF
when we attempt the second unref.

This change fixes the race by making __mctp_key_remove tolerant to being
called on a key that has already been removed from the socket/net lists,
and only performs the unref when we do the actual remove. We also need
to hold the list lock on the ioctl cleanup path.

This fix is based on a bug report and comprehensive analysis from
butt3rflyh4ck <butterflyhuangxx@gmail.com>, found via syzkaller.

Cc: stable@vger.kernel.org
Fixes: 63ed1aab3d40 ("mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control")
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
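
The "tolerant removal" idea above can be sketched as follows. This is a
simplified userspace model, assuming the caller already holds whatever lock
protects the list; the on_list flag and the sketch_* names are hypothetical
and stand in for the real list membership test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a key that may sit on socket/net lists. */
struct sketch_key {
	int refs;
	bool on_list;
};

static void sketch_key_unref(struct sketch_key *key)
{
	key->refs--;
}

/* Caller is assumed to hold the list lock. Returns true only if this call
 * actually removed the key; a second call is a no-op and does not unref.
 */
static bool sketch_key_remove(struct sketch_key *key)
{
	if (!key->on_list)
		return false;

	key->on_list = false;
	sketch_key_unref(key); /* drop the reference held by the list */
	return true;
}
```

Because the unref is tied to the actual removal, two racing callers (for
example an ioctl path and a close path) drop the list's reference exactly
once between them.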


# 4a9dda1c 31-Mar-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: Use output netdev to allocate skb headroom

Previously the skb was allocated with headroom MCTP_HEADER_MAXLEN,
but that isn't sufficient if we are using devs that are not MCTP
specific.

This also adds a check that the smctp_halen provided to sendmsg for
extended addressing is the correct size for the netdev.

Fixes: 833ef3b91de6 ("mctp: Populate socket implementation")
Reported-by: Matthew Rinaldi <mjrinal@g.clemson.edu>
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 60be976ac 31-Mar-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: Fix check for dev_hard_header() result

dev_hard_header() returns the length of the header, so
we need to test for negative errors rather than non-zero.

Fixes: 889b7da23abf ("mctp: Add initial routing framework")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
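
To illustrate the difference between the two tests, here is a small userspace
sketch. sketch_hard_header() is a hypothetical stand-in that follows the
convention described above (header length on success, negative errno on
failure); it is not the kernel function.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for a header-building call: returns the header length on
 * success, or a negative errno on failure.
 */
static int sketch_hard_header(int hdr_len, int fail)
{
	return fail ? -EINVAL : hdr_len;
}

static int sketch_output(int hdr_len, int fail)
{
	int rc = sketch_hard_header(hdr_len, fail);

	/* Testing "if (rc)" here would treat every non-empty header as an
	 * error, since a positive length is the normal success value.
	 */
	if (rc < 0)
		return rc;

	return 0;
}
```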


# 8d783197 21-Feb-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: Fix warnings reported by clang-analyzer

net/mctp/device.c:140:11: warning: Assigned value is garbage or undefined
[clang-analyzer-core.uninitialized.Assign]
mcb->idx = idx;

- Not a real problem due to how the callback runs, fix the warning.

net/mctp/route.c:458:4: warning: Value stored to 'msk' is never read
[clang-analyzer-deadcode.DeadStores]
msk = container_of(key->sk, struct mctp_sock, sk);

- 'msk' dead assignment can be removed here.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# e297db3e 21-Feb-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: Fix incorrect netdev unref for extended addr

In the extended addressing local route output codepath
dev_get_by_index_rcu() doesn't take a dev_hold() so we shouldn't
dev_put().

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# dc121c00 21-Feb-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: make __mctp_dev_get() take a refcount hold

Previously there was a race that could allow the mctp_dev refcount
to hit zero:

    rcu_read_lock();
    mdev = __mctp_dev_get(dev);
    // mctp_unregister() happens here, mdev->refs hits zero
    mctp_dev_hold(mdev);
    rcu_read_unlock();

Now we make __mctp_dev_get() take the hold itself. It is safe to test
against the zero refcount because __mctp_dev_get() is called holding
rcu_read_lock and mctp_dev uses kfree_rcu().

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 86cdfd63 17-Feb-2022 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: add address validity checking for packet receive

This change adds some basic sanity checks for the source and dest
headers of packets on initial receive.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# cb196b72 17-Feb-2022 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: replace mctp_address_ok with more fine-grained helpers

Currently, we have mctp_address_ok(), which checks if an EID is in the
"valid" range of 8-254 inclusive. However, 0 and 255 may also be valid
addresses, depending on context. 0 is the NULL EID, which may be set
when physical addressing is used. 255 is valid as a destination address
for broadcasts.

This change renames mctp_address_ok to mctp_address_unicast, and adds
similar helpers for broadcast and null EIDs, which will be used in an
upcoming commit.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
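
The three address classes described above map naturally onto three small
predicates. The sketch below uses hypothetical sketch_* names and takes the
ranges directly from the text (8-254 unicast, 0 null, 255 broadcast); it is
an illustration, not the kernel's header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EIDs 8..254 inclusive are ordinary unicast endpoint addresses. */
static bool sketch_address_unicast(uint8_t eid)
{
	return eid >= 8 && eid <= 254;
}

/* EID 255 is valid only as a broadcast destination. */
static bool sketch_address_broadcast(uint8_t eid)
{
	return eid == 255;
}

/* EID 0 is the null EID, used when physical addressing is in effect. */
static bool sketch_address_null(uint8_t eid)
{
	return eid == 0;
}
```

EIDs 1 through 7 fall into none of the three classes in this sketch, which
matches the "valid" range given in the commit message.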


# 63ed1aab 08-Feb-2022 Matt Johnston <matt@codeconstruct.com.au>

mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control

This change adds a couple of new ioctls for mctp sockets:
SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG. These ioctls provide facilities
for explicit allocation / release of tags, overriding the automatic
allocate-on-send/release-on-reply and timeout behaviours. This allows
userspace more control over messages that may not fit a simple
request/response model.

In order to indicate a pre-allocated tag to the sendmsg() syscall, we
introduce a new flag to the struct sockaddr_mctp.smctp_tag value:
MCTP_TAG_PREALLOC.

Additional changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Contains a fix that was:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0de55a7d 08-Feb-2022 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Allow keys matching any local address

Currently, we require an exact match on an incoming packet's dest
address, and the key's local_addr field.

In a future change, we may want to set up a key before packets are
routed, meaning we have no local address to match on.

This change allows key lookups to match on local_addr = MCTP_ADDR_ANY.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 8069b22d 08-Feb-2022 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add helper for address match checking

Currently, we have a couple of paths that check that an EID matches, or
the match value is MCTP_ADDR_ANY.

Rather than open coding this, add a little helper.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
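
A helper of the kind described above could be as small as the sketch below.
The sketch_* name and the 0xff value used for the wildcard are assumptions
for this example. Note the asymmetry: only the match value (the stored side)
may be the wildcard, not the EID being tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ADDR_ANY 0xff /* assumed wildcard EID value */

/* True if eid equals the match value, or the match value is the wildcard. */
static bool sketch_address_matches(uint8_t match, uint8_t eid)
{
	return match == eid || match == SKETCH_ADDR_ANY;
}
```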


# 7e5b6a5c 14-Feb-2022 Tom Rix <trix@redhat.com>

mctp: fix use after free

Clang static analysis reports this problem
route.c:425:4: warning: Use of memory after it is freed
trace_mctp_key_acquire(key);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
When mctp_key_add() fails, key is freed but then is later
used in trace_mctp_key_acquire(). Add an else statement
to use the key only when mctp_key_add() is successful.

Fixes: 4f9e1ba6de45 ("mctp: Add tracepoints for tag/key handling")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>


# d9e56d18 02-Dec-2021 Xu Wang <vulab@iscas.ac.cn>

mctp: Remove redundant if statements

The 'if (dev)' check has already been moved into dev_{put,hold}(), so
remove the redundant if statements.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 5cfe53cf 29-Nov-2021 Yang Yingliang <yangyingliang@huawei.com>

mctp: remove unnecessary check before calling kfree_skb()

The skb will be checked inside kfree_skb(), so remove the
outside check.

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211130031243.768823-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 76d00160 01-Dec-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Don't let RTM_DELROUTE delete local routes

We need to test against the existing route type, not
the rtm_type in the netlink request.

Fixes: 83f0a0b7285b ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 67737c45 28-Oct-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Pass flow data & flow release events to drivers

Now that we have an extension for MCTP data in skbs, populate the flow
when a key has been created for the packet, and add a device driver
operation to inform of flow destruction.

Includes a fix for a warning with test builds:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 212c10c3 28-Oct-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Return new key from mctp_alloc_local_tag

In a future change, we will want the key available for future use after
allocating a new tag. This change returns the key from
mctp_alloc_local_tag, rather than just key->tag.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 99ce45d5 25-Oct-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Implement extended addressing

This change allows an extended address struct - struct sockaddr_mctp_ext
- to be passed to sendmsg/recvmsg. This allows userspace to specify
output ifindex and physical address information (for sendmsg) or receive
the input ifindex/physaddr for incoming messages (for recvmsg). This is
typically used by userspace for MCTP address discovery and assignment
operations.

The extended addressing facility is conditional on a new sockopt:
MCTP_OPT_ADDR_EXT; userspace must explicitly enable addressing before
the kernel will consume/populate the extended address data.

Includes a fix for an uninitialised var:
Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 0b93aed2 14-Oct-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Avoid leak of mctp_sk_key

mctp_key_alloc() returns a key already referenced.

The mctp_route_input() path receives a packet for a bind socket and
allocates a key. It passes the key to mctp_key_add() which takes a
refcount and adds the key to lists. mctp_route_input() should then
release its own refcount when setting the key pointer to NULL.

In the mctp_alloc_local_tag() path (for mctp_local_output()) we
similarly need to unref the key before returning (mctp_reserve_tag()
takes a refcount and adds the key to lists).

Fixes: 73c618456dc5 ("mctp: locking, lifetime and validity changes for sk_keys")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 161eba50 02-Oct-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add initial test structure and fragmentation test

This change adds the first kunit test for the mctp subsystem, and an
initial test for the fragmentation path.

We're adding tests under a new net/mctp/test/ directory.

Incorporates a fix for module configs:

Reported-by: kernel test robot <lkp@intel.com>

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# b022f886 01-Oct-2021 David S. Miller <davem@davemloft.net>

Revert "Merge branch 'mctp-kunit-tests'"

This reverts commit 4f42ad2011d2fcbd89f5cdf56121271a8cd5ee5d, reversing
changes made to ea2dd331bfaaeba74ba31facf437c29044f7d4cb.

These changes break the build when mctp is modular.

Signed-off-by: David S. Miller <davem@davemloft.net>


# 8c02066b 01-Oct-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add initial test structure and fragmentation test

This change adds the first kunit test for the mctp subsystem, and an
initial test for the fragmentation path.

We're adding tests under a new net/mctp/test/ directory.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 6183569d 29-Sep-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Set route MTU via netlink

A route's RTAX_MTU can be set in nested RTAX_METRICS

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4f9e1ba6 29-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add tracepoints for tag/key handling

The tag allocation, release and bind events are somewhat opaque outside
the kernel; this change adds a few tracepoints to assist in
instrumentation and debugging.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 7b14e15a 29-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Implement a timeout for tags

Currently, a MCTP (local-eid,remote-eid,tag) tuple is allocated to a
socket on send, and only expires when the socket is closed.

This change introduces a tag timeout, freeing the tuple after a fixed
expiry - currently six seconds. This is greater than (but close to) the
max response timeout in upper-layer bindings.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 43f55f23 29-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add refcounts to mctp_dev

Currently, we tie the struct mctp_dev lifetime to the underlying struct
net_device, and hold/put that device as a proxy for a separate mctp_dev
refcount. This works because we're not holding any references to the
mctp_dev that are different from the netdev lifetime.

In a future change we'll break that assumption though, as we'll need to
hold mctp_dev references in a workqueue, which might live past the
netdev unregister notification.

In order to support that, this change introduces a refcount on the
mctp_dev, currently taken by the net_device->mctp_ptr reference, and
released on netdev unregister events. We can then use this for future
references that might outlast the net device.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 73c61845 29-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: locking, lifetime and validity changes for sk_keys

We will want to invalidate sk_keys in a future change, which will
require a boolean flag to mark invalidated items in the socket & net
namespace lists. We'll also need to take a reference to keys, held over
non-atomic contexts, so we need a refcount on keys also.

This change adds a validity flag (currently always true) and refcount to
struct mctp_sk_key. With a refcount on the keys, using RCU no longer
makes much sense; we have exact indications on the lifetime of keys. So,
we also change the RCU list traversal to a locked implementation.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 1f6c77ac 29-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Allow local delivery to the null EID

We may need to receive packets addressed to the null EID (==0), but
addressed to us at the physical layer.

This change adds a lookup for local routes when we see a packet
addressed to EID 0, and a local phys address.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# f364dd71 29-Sep-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Allow MCTP on tun devices

Allowing TUN is useful for testing, to route packets to userspace or to
tunnel between machines.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 581edcd0 07-Sep-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: perform route destruction under RCU read lock

The kernel test robot reports:

[ 843.509974][ T345] =============================
[ 843.524220][ T345] WARNING: suspicious RCU usage
[ 843.538791][ T345] 5.14.0-rc2-00606-g889b7da23abf #1 Not tainted
[ 843.553617][ T345] -----------------------------
[ 843.567412][ T345] net/mctp/route.c:310 RCU-list traversed in non-reader section!!

- we're missing the rcu read lock acquire around the destruction path.

This change adds the acquire/release - the path is already atomic, and
we're using the _rcu list iterators.

Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 83f0a0b7 09-Aug-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Specify route types, require rtm_type in RTM_*ROUTE messages

This change adds a 'type' attribute to routes, which can be parsed from
a RTM_NEWROUTE message. This will help to distinguish local vs. peer
routes in a future change.

This means userspace will need to set a correct rtm_type in RTM_NEWROUTE
and RTM_DELROUTE messages; we currently only accept RTN_UNICAST.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://lore.kernel.org/r/20210810023834.2231088-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>


# 03f2bbc4 28-Jul-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Allow per-netns default networks

Currently we have a compile-time default network
(MCTP_INITIAL_DEFAULT_NET). This change introduces a default_net field
on the net namespace, allowing future configuration for new interfaces.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 26ab3fca 28-Jul-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Add dest neighbour lladdr to route output

Now that we have a neighbour implementation, hook it up to the output
path to set the dest hardware address for outgoing packets.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 4a992bbd 28-Jul-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Implement message fragmentation & reassembly

This change implements MCTP fragmentation (based on route & device MTU),
and corresponding reassembly.

The MCTP specification only allows for fragmentation on the originating
message endpoint, and reassembly on the destination endpoint -
intermediate nodes do not need to reassemble/refragment. Consequently,
we only fragment in the local transmit path, and reassemble
locally-bound packets. Messages are required to be in-order, so we
simply cancel reassembly on out-of-order or missing packets.

In the fragmentation path, we just break up the message into MTU-sized
fragments; the skb structure is a simple copy for now, which we can later
improve with a shared data implementation.

For reassembly, we keep track of incoming message fragments using the
existing tag infrastructure, allocating a key on the (src,dest,tag)
tuple, and reassemble matching fragments into a skb->frag_list.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
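
The arithmetic behind "MTU-sized fragments" can be sketched in userspace C.
This is not the kernel implementation: the 4-byte header size and the
start/end/sequence bit positions are assumptions based on the MCTP transport
header layout, the tag and tag-owner bits are omitted, and all sketch_* names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HDR_LEN   4u    /* assumed MCTP transport header size */
#define SKETCH_FLAG_SOM  0x80u /* assumed start-of-message bit */
#define SKETCH_FLAG_EOM  0x40u /* assumed end-of-message bit */
#define SKETCH_SEQ_SHIFT 4     /* assumed position of the 2-bit sequence */
#define SKETCH_SEQ_MASK  0x3u

/* Number of packets needed for msg_len payload bytes at the given MTU,
 * where each packet carries its own header. Returns 0 if nothing can be
 * sent (empty message, or MTU too small to hold any payload).
 */
static size_t sketch_frag_count(size_t msg_len, size_t mtu)
{
	size_t payload;

	if (msg_len == 0 || mtu <= SKETCH_HDR_LEN)
		return 0;

	payload = mtu - SKETCH_HDR_LEN;
	return (msg_len + payload - 1) / payload; /* round up */
}

/* Flags/sequence bits for fragment idx (0-based) out of count fragments:
 * the first carries SOM, the last carries EOM, and the sequence number
 * wraps modulo 4.
 */
static uint8_t sketch_frag_flags(size_t idx, size_t count)
{
	uint8_t flags = (uint8_t)((idx & SKETCH_SEQ_MASK) << SKETCH_SEQ_SHIFT);

	if (idx == 0)
		flags |= SKETCH_FLAG_SOM;
	if (idx + 1 == count)
		flags |= SKETCH_FLAG_EOM;
	return flags;
}
```

A message that fits in a single packet gets both SOM and EOM set on its only
fragment. Since the receiver cancels reassembly on out-of-order or missing
packets, the wrapping sequence number is what lets it detect a gap.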


# 833ef3b9 28-Jul-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Populate socket implementation

Start filling-out the socket syscalls: bind, sendmsg & recvmsg.

This requires an input route implementation, so we add to
mctp_route_input, allowing lookups on binds & message tags. This just
handles single-packet messages at present, we will add fragmentation in
a future change.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 06d2f4c5 28-Jul-2021 Matt Johnston <matt@codeconstruct.com.au>

mctp: Add netlink route management

This change adds RTM_GETROUTE, RTM_NEWROUTE & RTM_DELROUTE handlers,
allowing management of the MCTP route table.

Includes changes from Jeremy Kerr <jk@codeconstruct.com.au>.

Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>


# 889b7da2 28-Jul-2021 Jeremy Kerr <jk@codeconstruct.com.au>

mctp: Add initial routing framework

Add a simple routing table, and a couple of route output handlers, and
the mctp packet_type & handler.

Includes changes from Matt Johnston <matt@codeconstruct.com.au>.

Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>