History log of /freebsd-current/sys/net/route/route_helpers.c
Revision Date Author Comments
# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 4d846d26 10-May-2023 Warner Losh <imp@FreeBSD.org>

spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD

The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix


# 177f04d5 29-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: constantify @rc in rib_decompose_notification().

Clarify the @rc immutability by explicitly marking @rc const.

MFC after: 2 weeks


# 578a99c9 29-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: improve multiline debug

Add IF_DEBUG_LEVEL() macro to ensure all debug output preparation
is run only if the current debug level is sufficient. Consistently
use it within routing subsystem.

MFC after: 2 weeks


# d8b26934 09-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add rib_add_default_route() wrapper

Multiple consumers in the kernel space want to install IPv4 or IPv6
default route. Provide convenient wrapper to simplify the code
inside the customers.

MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D36167


# 2ce55385 04-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add rib_<add|del>_route_px() functions operating with nexthops.

This change adds public KPI to work with routes using pre-created
nexthops, instead of using data from addrinfo structures. These
functions will be later used for adding/deleting kernel-originated
routes and upcoming netlink protocol.

As a part of providing this KPI, low-level route addition code has been
reworked to provide more control over route creation or change.
Specifically, a number of operation flags
(RTM_F_<CREATE|EXCL|REPLACE|APPEND>) have been added, defining the
desired behaviour the the route already exists (or not exists). This
change required some changes in the multipath addition code, resulting
in moving this code to route_ctl.c, rendering mpath_ctl.c empty.

Differential Revision: https://reviews.freebsd.org/D36073
MFC after: 1 month


# 412bdb5a 03-Aug-2022 Mateusz Guzik <mjg@FreeBSD.org>

route: fix NOIP builds

Sponsored by: Rubicon Communications, LLC ("Netgate")


# ae6bfd12 01-Aug-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: refactor private KPI
* Make nhgrp_get_nhops() return const struct weightened_nhop to
indicate that the list is immutable
* Make nhgrp_get_group() return the actual group, instead of
group+weight.

MFC after: 2 weeks


# 5c23343b 29-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: convert remnants of DPRINTF to FIB_CTL_LOG().

Convert the last remaining pieces of old-style debug messages
to the new debugging framework.

Differential Revision: https://reviews.freebsd.org/D35994
MFC after: 2 weeks


# 27f107e1 31-Jul-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: add debug printing helpers for rtentry and RTM* cmds.

MFC after: 2 weeks


# db4b4021 04-Jul-2022 Mateusz Guzik <mjg@FreeBSD.org>

routing: hide notify_add and notify_del behind ROUTE_MPATH

Fixes a warn about unused routines without the option.

Sponsored by: Rubicon Communications, LLC ("Netgate")


# 8010b7a7 27-Jun-2022 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: simplify decompose_change_notification().

The function's goal is to compare old/new nhop/nexthop group for the route
and decompose it into the series of RTM_ADD/RTM_DELETE single-nhop
events, calling specified callback for each event.
Simplify it by properly leveraging the fact that both old/new groups
are sorted nhop-# ascending.

Tested by: Claudio Jeker<claudio.jeker@klarasystems.com>
Differential Revision: https://reviews.freebsd.org/D35598
MFC after: 2 weeks


# f84c3010 30-Aug-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: Fix newly-added rt_get_inet[6]_parent() api.

Correctly handle the case when no default route is present.

Reported by: Konrad <konrad.kreciwilk at korbank.pl>


# 36e15b71 15-Aug-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

routing: Fix crashes with dpdk_lpm[46] algo.

When a prefix gets deleted from the RIB, dpdk_lpm algo needs to know
the nexthop of the "parent" prefix to update its internal state.
The glue code, which utilises RIB as a backing route store, uses
fib[46]_lookup_rt() for the prefix destination after its deletion
to fetch the desired nexthop.
This approach does not work when deleting less-specific prefixes
with most-specific ones are still present. For example, if
10.0.0.0/24, 10.0.0.0/23 and 10.0.0.0/22 exist in RIB, deleting
10.0.0.0/23 would result in 10.0.0.0/24 being returned as a search
result instead of 10.0.0.0/22. This, in turn, results in the failed
datastructure update: part of the deleted /23 prefix will still
contain the reference to an old nexthop. This leads to the
use-after-free behaviour, ending with the eventual crashes.

Fix the logic flaw by properly fetching the prefix "parent" via
newly-created rt_get_inet[6]_parent() helpers.

Differential Revision: https://reviews.freebsd.org/D31546
PR: 256882,256833
MFC after: 1 week


# f9668e42 25-Apr-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add rib_walk_from() wrapper for selective rib tree traversal.

Provide wrapper for the rnh_walktree_from() rib callback.
As currently `struct rib_head` is considered internal to the
routing subsystem, this wrapper is necessary to maintain isolation
from the external code.

Differential Revision: https://reviews.freebsd.org/D29971
MFC after: 1 week


# 151ec796 30-Jan-2021 Alexander V. Chernikov <melifaro@FreeBSD.org>

Fix the design problem with delayed algorithm sync.

Currently, if the immutable algorithm like bsearch or radix_lockless
receives rtable update notification, it schedules algorithm rebuild.
This rebuild is executed by the callout after ~50 milliseconds.

It is possible that a script adding an interface address and than route
with the gateway bound to that address will fail. It can happen due
to the fact that fib is not updated by the time the route addition
request arrives.

Fix this by allowing synchronous algorithm rebuilds based on certain
conditions. By default, these conditions assume:
1) less than net.route.algo.fib_sync_limit=100 routes
2) routes without gateway.

* Move algo instance build entirely under rib WLOCK.
Rib lock is only used for control plane (except radix algo, but there
are no rebuilds).
* Add rib_walk_ext_locked() function to allow RIB iteration with
rib lock already held.
* Fix rare potential callout use-after-free for fds by binding fd
callout to the relevant rib rmlock. In that case, callout_stop()
under rib WLOCK guarantees no callout will be executed afterwards.

MFC after: 3 days


# 3b1654cb 29-Nov-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Introduce rib_walk_ext_internal() to allow iteration with rnh pointer.

This solves the case when rib is not yet attached/detached to/from the
system rib array.

Differential Revision: https://reviews.freebsd.org/D27406


# 7511a638 22-Nov-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Refactor rib iterator functions.

* Make rib_walk() order of arguments consistent with the rest of RIB api
* Add rib_walk_ext() allowing to exec callback before/after iteration.
* Rename rt_foreach_fib_walk_del -> rib_foreach_table_walk_del
* Rename rt_forach_fib_walk -> rib_foreach_table_walk
* Move rib_foreach_table_walk{_del} to route/route_helpers.c
* Slightly refactor rib_foreach_table_walk{_del} to make the implementation
consistent and prepare for upcoming iterator optimizations.

Differential Revision: https://reviews.freebsd.org/D27219


# fedeb08b 03-Oct-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Introduce scalable route multipath.

This change is based on the nexthop objects landed in D24232.

The change introduces the concept of nexthop groups.
Each group contains the collection of nexthops with their
relative weights and a dataplane-optimized structure to enable
efficient nexthop selection.

Simular to the nexthops, nexthop groups are immutable. Dataplane part
gets compiled during group creation and is basically an array of
nexthop pointers, compiled w.r.t their weights.

With this change, `rt_nhop` field of `struct rtentry` contains either
nexthop or nexthop group. They are distinguished by the presense of
NHF_MULTIPATH flag.
All dataplane lookup functions returns pointer to the nexthop object,
leaving nexhop groups details inside routing subsystem.

User-visible changes:

The change is intended to be backward-compatible: all non-mpath operations
should work as before with ROUTE_MPATH and net.route.multipath=1.

All routes now comes with weight, default weight is 1, maximum is 2^24-1.

Current maximum multipath group width is statically set to 64.
This will become sysctl-tunable in the followup changes.

Using functionality:
* Recompile kernel with ROUTE_MPATH
* set net.route.multipath to 1

route add -6 2001:db8::/32 2001:db8::2 -weight 10
route add -6 2001:db8::/32 2001:db8::3 -weight 20

netstat -6On

Nexthop groups data

Internet6:
GrpIdx NhIdx Weight Slots Gateway Netif Refcnt
1 ------- ------- ------- --------------------------------------- --------- 1
13 10 1 2001:db8::2 vlan2
14 20 2 2001:db8::3 vlan2

Next steps:
* Land outbound hashing for locally-originated routes ( D26523 ).
* Fix net/bird multipath (net/frr seems to work fine)
* Add ROUTE_MPATH to GENERIC
* Set net.route.multipath=1 by default

Tested by: olivier
Reviewed by: glebius
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D26449


# 662c1305 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

net: clean up empty lines in .c and .h files


# a624ca3d 28-Aug-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move net/route/shared.h definitions to net/route/route_var.h.

No functional changes.

net/route/shared.h was created in the inital phases of nexthop conversion.
It was intended to serve the same purpose as route_var.h - share definitions
of functions and structures between the routing subsystem components. At
that time route_var.h was included by many files external to the routing
subsystem, which largerly defeats its purpose.

As currently this is not the case anymore and amount of route_var.h includes
is roughly the same as shared.h, retire the latter in favour of the former.


# da187ddb 01-Jun-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add rib_<add|del|change>_route() functions to manipulate the routing table.

The main driver for the change is the need to improve notification mechanism.
Currently callers guess the operation data based on the rtentry structure
returned in case of successful operation result. There are two problems with
this appoach. First is that it doesn't provide enough information for the
upcoming multipath changes, where rtentry refers to a new nexthop group,
and there is no way of guessing which paths were added during the change.
Second is that some rtentry fields can change during notification and
protecting from it by requiring customers to unlock rtentry is not desired.

Additionally, as the consumers such as rtsock do know which operation they
request in advance, making explicit add/change/del versions of the functions
makes sense, especially given the functions don't share a lot of code.

With that in mind, introduce rib_cmd_info notification structure and
rib_<add|del|change>_route() functions, with mandatory rib_cmd_info pointer.
It will be used in upcoming generalized notifications.

* Move definitions of the new functions and some other functions/structures
used for the routing table manipulation to a separate header file,
net/route/route_ctl.h. net/route.h is a frequently used file included in
~140 places in kernel, and 90% of the users don't need these definitions.

Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D25067


# e7403d02 01-Jun-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Revert r361704, it accidentally committed merged D25067 and D25070.


# 79674562 01-Jun-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

* Add rib_<add|del|change>_route() functions to manipulate the routing table.

The main driver for the change is the need to improve notification mechanism.
Currently callers guess the operation data based on the rtentry structure
returned in case of successful operation result. There are two problems with
this appoach. First is that it doesn't provide enough information for the
upcoming multipath changes, where rtentry refers to a new nexthop group,
and there is no way of guessing which paths were added during the change.
Second is that some rtentry fields can change during notification and
protecting from it by requiring customers to unlock rtentry is not desired.

Additionally, as the consumers such as rtsock do know which operation they
request in advance, making explicit add/change/del versions of the functions
makes sense, especially given the functions don't share a lot of code.

With that in mind, introduce rib_cmd_info notification structure and
rib_<add|del|change>_route() functions, with mandatory rib_cmd_info pointer.
It will be used in upcoming generalized notifications.

* Move definitions of the new functions and some other functions/structures
used for the routing table manipulation to a separate header file,
net/route/route_ctl.h. net/route.h is a frequently used file included in
~140 places in kernel, and 90% of the users don't need these definitions.

Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D25067


# 682b902d 07-May-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Add rib_lookup() sockaddr lookup wrapper and make ifa_ifwithroute use it.

Create rib_lookup() wrapper around per-af dataplane lookup functions.
This will help in the cases of having control plane af-agnostic code.

Switch ifa_ifwithroute() to use this function instead of rtalloc1().

Reviewed by: ae
Differential Revision: https://reviews.freebsd.org/D24731


# e7d8af4f 28-Apr-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Move route_temporal.c and route_var.h to net/route.

Nexthop objects implementation, defined in r359823,
introduced sys/net/route directory intended to hold all
routing-related code. Move recently-introduced route_temporal.c and
private route_var.h header there.

Differential Revision: https://reviews.freebsd.org/D24597


# a6663252 12-Apr-2020 Alexander V. Chernikov <melifaro@FreeBSD.org>

Introduce nexthop objects and new routing KPI.

This is the foundational change for the routing subsytem rearchitecture.
More details and goals are available in https://reviews.freebsd.org/D24141 .

This patch introduces concept of nexthop objects and new nexthop-based
routing KPI.

Nexthops are objects, containing all necessary information for performing
the packet output decision. Output interface, mtu, flags, gw address goes
there. For most of the cases, these objects will serve the same role as
the struct rtentry is currently serving.
Typically there will be low tens of such objects for the router even with
multiple BGP full-views, as these objects will be shared between routing
entries. This allows to store more information in the nexthop.

New KPI:

struct nhop_object *fib4_lookup(uint32_t fibnum, struct in_addr dst,
uint32_t scopeid, uint32_t flags, uint32_t flowid);
struct nhop_object *fib6_lookup(uint32_t fibnum, const struct in6_addr *dst6,
uint32_t scopeid, uint32_t flags, uint32_t flowid);

These 2 function are intended to replace all all flavours of
<in_|in6_>rtalloc[1]<_ign><_fib>, mpath functions and the previous
fib[46]-generation functions.

Upon successful lookup, they return nexthop object which is guaranteed to
exist within current NET_EPOCH. If longer lifetime is desired, one can
specify NHR_REF as a flag and get a referenced version of the nexthop.
Reference semantic closely resembles rtentry one, allowing sed-style conversion.

Additionally, another 2 functions are introduced to support uRPF functionality
inside variety of our firewalls. Their primary goal is to hide the multipath
implementation details inside the routing subsystem, greatly simplifying
firewalls implementation:

int fib4_lookup_urpf(uint32_t fibnum, struct in_addr dst, uint32_t scopeid,
uint32_t flags, const struct ifnet *src_if);
int fib6_lookup_urpf(uint32_t fibnum, const struct in6_addr *dst6, uint32_t scopeid,
uint32_t flags, const struct ifnet *src_if);

All functions have a separate scopeid argument, paving way to eliminating IPv6 scope
embedding and allowing to support IPv4 link-locals in the future.

Structure changes:
* rtentry gets new 'rt_nhop' pointer, slightly growing the overall size.
* rib_head gets new 'rnh_preadd' callback pointer, slightly growing overall sz.

Old KPI:
During the transition state old and new KPI will coexists. As there are another 4-5
decent-sized conversion patches, it will probably take a couple of weeks.
To support both KPIs, fields not required by the new KPI (most of rtentry) has to be
kept, resulting in the temporary size increase.
Once conversion is finished, rtentry will notably shrink.

More details:
* architectural overview: https://reviews.freebsd.org/D24141
* list of the next changes: https://reviews.freebsd.org/D24232

Reviewed by: ae,glebius(initial version)
Differential Revision: https://reviews.freebsd.org/D24232