History log of /openbsd-current/usr.sbin/bgpd/bgpd.c
Revision Date Author Comments
# 1.264 15-May-2024 job

Mark RTR and IPv6 BGP packets with DSCP CS6 (network control)

Additionally, set TCP_NODELAY on the RTR socket; there is no need to
queue up messages towards the RTR server.

OK claudio@
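
A minimal sketch of what this kind of socket marking could look like for an
IPv4 TCP socket. It is not the bgpd code: the helper name
mark_network_control() and the TOS_CS6 constant are made up for illustration.
CS6 is DSCP value 48, which sits in the upper six bits of the TOS byte
(48 << 2 = 0xc0). An IPv6 socket would need the IPV6_TCLASS option instead of
IP_TOS.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

#define TOS_CS6	0xc0	/* DSCP CS6 (48) shifted into the TOS byte */

/*
 * Hypothetical helper: mark an IPv4 TCP socket as network control
 * traffic and disable Nagle's algorithm so small messages are sent
 * right away. Returns 0 on success, -1 on error.
 */
int
mark_network_control(int fd)
{
	int tos = TOS_CS6;
	int on = 1;

	if (setsockopt(fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == -1)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on)) == -1)
		return -1;
	return 0;
}
```

Both options can be set before connect(), so the very first segments of the
session already carry the marking.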


# 1.263 09-Apr-2024 claudio

Check that the ASPA tas array fits in an IMSG before sending the ASPA
record over to RTR or the RDE.

The long term goal is to increase the IMSG size considerably but that
requires some additional API changes to the imsg API.
OK tb@


Revision tags: OPENBSD_7_5_BASE
# 1.262 09-Jan-2024 claudio

Convert the parent process imsg handling over to the new imsg API.

This simplifies the code a fair bit and removes direct unchecked memory
access to imsg.data.
OK tb@


# 1.261 04-Jan-2024 claudio

Rename argument roa of imsg_send_sockets() to rtr since the imsgbuf is
for PROC_RTR.


# 1.260 07-Nov-2023 claudio

Rename struct imsgbuf *ibuf to *imsgbuf in all places.
ibuf should be reserved for struct ibuf * values.
OK tb@


Revision tags: OPENBSD_7_4_BASE
# 1.259 16-Aug-2023 claudio

Remove per-AFI ASPA handling in bgpd internals

With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication from the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries no longer
prevents bad routes from hitting the FIB. Also loopback IPs are checked in a
few other places to prevent bad routes from being installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen, so that on EINTR bgpd
processes reconfigure requests or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
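
The loop shape described here can be sketched as follows. This is an
illustration under assumptions, not the bgpd main loop: loop_once() and the
reconfig_pending flag are hypothetical names, and a real daemon would set the
flag from a signal handler.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/* Hypothetical flag; a real daemon would set it from a SIGHUP handler. */
volatile sig_atomic_t reconfig_pending;

/*
 * Run one iteration of a poll loop.
 * Returns -1 on a fatal poll() error, otherwise the number of pending
 * reconfigure requests handled (0 or 1). *ready is set to the number
 * of descriptors that reported input or hangup.
 */
int
loop_once(struct pollfd *pfd, nfds_t n, int timeout_ms, int *ready)
{
	nfds_t i;
	int handled = 0;

	*ready = 0;
	if (poll(pfd, n, timeout_ms) == -1) {
		if (errno != EINTR)
			return -1;
		/* EINTR: do not look at revents, skip the pollfd work. */
	} else {
		for (i = 0; i < n; i++)
			if (pfd[i].revents & (POLLIN | POLLHUP))
				(*ready)++;
	}

	/* Signal-driven work runs even when poll() was interrupted. */
	if (reconfig_pending) {
		reconfig_pending = 0;
		handled = 1;
	}
	return handled;
}
```

The point of the structure is that the flag check sits after the if/else, so
an interrupted poll() skips the descriptor handling but never skips the
signal-driven work.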


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
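
The pitfall can be illustrated with a small stand-in. loopback_sa() below is a
hypothetical replacement for addr2sa() (the real function takes a bgpd address
structure); it only serves to show why the call is hoisted out of the
argument list.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for addr2sa(): builds a sockaddr for
 * 127.0.0.1:port in static storage and stores its size in *len.
 */
struct sockaddr *
loopback_sa(uint16_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

int
connect_loopback(int fd, uint16_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Not: connect(fd, loopback_sa(port, &len), len);
	 * The order in which function arguments are evaluated is
	 * unspecified, so len could be read before it is written.
	 */
	sa = loopback_sa(port, &len);
	return connect(fd, sa, len);
}
```

Which order a compiler picks for the inlined form is up to the compiler, which
fits the observation that the problem showed up on one platform only.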


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config(), which sends the config to the SE and RDE,
out of reconfigure().
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
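
A hedged sketch of the idea. READ_SIZE, MIN_BUF, set_buf() and
ipc_socketpair() are illustrative names and values, not taken from bgpd, and
the halving fallback is an assumption added so the sketch still works on
systems with a lower kernel limit.

```c
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535		/* hypothetical size of one read */
#define MIN_BUF		(16 * 1024)	/* hypothetical lower bound */

/* Try to set one buffer option, halving the size until the kernel agrees. */
static int
set_buf(int fd, int opt, int want)
{
	int bsize;

	for (bsize = want; bsize >= MIN_BUF; bsize /= 2)
		if (setsockopt(fd, SOL_SOCKET, opt, &bsize, sizeof(bsize)) == 0)
			return 0;
	return -1;
}

/*
 * Create a socketpair whose send and receive buffers are sized to
 * about four reads. Returns 0 on success, -1 on error.
 */
int
ipc_socketpair(int fds[2])
{
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (set_buf(fds[i], SO_RCVBUF, 4 * READ_SIZE) == -1 ||
		    set_buf(fds[i], SO_SNDBUF, 4 * READ_SIZE) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}
```

With buffers several reads deep, one wakeup from poll() is more likely to find
a full buffer to read or enough room for a full write.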


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
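
The idiom in isolation, using a made-up pack() function in place of the imsg
calls. struct peer_msg and both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message payload. */
struct peer_msg {
	unsigned int	id;
	char		descr[32];
};

/* Copy len bytes of data into buf; returns bytes copied, 0 if too large. */
size_t
pack(void *buf, size_t buflen, const void *data, size_t len)
{
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

size_t
pack_peer(void *buf, size_t buflen, const struct peer_msg *pm)
{
	/*
	 * sizeof(*pm) follows the type of pm automatically, while
	 * sizeof(struct peer_msg) has to be kept in sync by hand.
	 */
	return pack(buf, buflen, pm, sizeof(*pm));
}
```

If the type of pm is changed later, the size passed along changes with it and
no second edit is needed.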


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this rework. The refreshing of the keys is done whenever
the session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps the CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
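
A small sketch of how the two cases can be handled. place_fd() is a
hypothetical helper name, and clearing the flag with fcntl() in the equal case
is one possible approach, shown here under that assumption.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure the open file behind oldd is
 * available as descriptor newd and survives exec().
 * Returns 0 on success, -1 on error.
 */
int
place_fd(int oldd, int newd)
{
	int flags;

	if (oldd != newd) {
		/* The duplicate created by dup2() has FD_CLOEXEC cleared. */
		return dup2(oldd, newd) == -1 ? -1 : 0;
	}

	/* dup2(fd, fd) changes nothing, so clear close-on-exec by hand. */
	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return -1;
	return fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0;
}
```

The equal case matters when the descriptor to be handed to a child happens to
already sit at the number the child expects.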


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd;
this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
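
A minimal sketch of a bsearch-backed AS lookup, assuming the set is a plain
sorted array of 32-bit AS numbers. as_set_prepare() and as_set_match() are
hypothetical names, not the bgpd set API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the table once after it has been filled. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is in the sorted table, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, where walking a long list of AS
numbers in a filter rule costs O(n).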


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) from becoming carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
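
A rough sketch of the pattern from 1.127, assuming a single descriptor and a
hypothetical helper name (read_byte_if_ready() is not part of bgpd). In a
long-running loop the pollfd array is reused, which is why clearing it
before every poll(2) call matters.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable
 * and read one byte. Returns 1 if a byte was read, 0 on timeout and
 * -1 on error (including EINTR).
 */
int
read_byte_if_ready(int fd, int timeout_ms, char *out)
{
	struct pollfd pfd[1];
	int nfds;

	/* Clear the array first so no stale revents can survive a failure. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return -1;	/* do not look at revents after an error */
	if (nfds == 0)
		return 0;	/* timeout: nothing to look at either */

	if (pfd[0].revents & POLLIN)
		return read(fd, out, 1) == 1 ? 1 : -1;
	return -1;
}
```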


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
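
The ordering from 1.106 can be illustrated with a small sketch. All names
here (reconfig_pending, do_reload(), check_reload()) are hypothetical; the
point is only that the flag is cleared before the work starts, so a signal
arriving during the work is not lost.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set from the signal handler */
int reloads_done;			/* illustrative counter only */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reloads_done++;
}

/* Returns 1 if a reload was performed, 0 otherwise. */
int
check_reload(void)
{
	if (reconfig_pending) {
		/*
		 * Clear first: a signal delivered while do_reload() runs
		 * sets the flag again and is handled on the next pass.
		 * Clearing afterwards would silently drop that signal.
		 */
		reconfig_pending = 0;
		do_reload();
		return 1;
	}
	return 0;
}
```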


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
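
The mechanism underneath file descriptor passing is SCM_RIGHTS ancillary
data on a Unix-domain socket. The sketch below shows that mechanism only; it
is not the imsg/msgbuf code, send_fd() and recv_fd() are hypothetical
helpers, and a Unix-domain socket between the two processes is assumed.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>	/* close(2), for callers once the fd is handed over */

/* Send fd over the Unix-domain socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char dummy = 0;			/* at least one data byte is sent */

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char dummy;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the scenario from the commit message, the privileged process would call
socket() and bind() itself and then hand the resulting descriptor to the
unprivileged one with something like send_fd().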


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
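
A minimal sketch of the check described in 1.56, assuming a hypothetical
helper child_is_gone(); only waitpid(2) and the W* status macros are real
interfaces. SIGCHLD is also delivered when a child is stopped or continued,
so the status has to be inspected before concluding that the child died.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>	/* fork(2) and friends, used by the caller */

/*
 * Hypothetical helper, called after SIGCHLD was noticed.
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if it is still around, -1 if waitpid(2) failed.
 */
int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;	/* waitpid's return is a pid_t, not an int */

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* no state change to report yet */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;	/* this is the "log & quit" case */
	return 0;
}
```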


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
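
The split between filling the buffer and extracting messages can be
sketched as below. This is not the imsg API: struct rbuf, rbuf_fill(),
rbuf_get() and rbuf_drain() are hypothetical, and the 16-bit length header
in host byte order is an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	2	/* assumed header: 16-bit total length */

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* Stand-in for the read step: append n bytes. Returns 0 or -1. */
int
rbuf_fill(struct rbuf *rb, const void *data, size_t n)
{
	if (n > RBUF_SIZE - rb->len)
		return -1;
	memcpy(rb->buf + rb->len, data, n);
	rb->len += n;
	return 0;
}

/*
 * Extract exactly one message without touching the descriptor.
 * Returns the payload length, or -1 if no complete (or no valid)
 * message is buffered.
 */
ssize_t
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	uint16_t total;

	if (rb->len < HDR_SIZE)
		return -1;
	memcpy(&total, rb->buf, HDR_SIZE);
	if (total < HDR_SIZE || total > rb->len)
		return -1;
	if ((size_t)(total - HDR_SIZE) > outsz)
		return -1;
	memcpy(out, rb->buf + HDR_SIZE, total - HDR_SIZE);
	/* Shift the remaining bytes down so the next message is first. */
	memmove(rb->buf, rb->buf + total, rb->len - total);
	rb->len -= total;
	return total - HDR_SIZE;
}

/* Caller side: loop until the buffer holds no complete message. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char payload[RBUF_SIZE];
	int n = 0;

	while (rbuf_get(rb, payload, sizeof(payload)) != -1)
		n++;	/* a real caller would dispatch the message here */
	return n;
}
```

Calling rbuf_get() only once per poll event would leave the second message
sitting in the buffer, which is the queue build-up the commit describes.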


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...
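
The difference between the two flavours can be sketched as follows. These
helpers are hypothetical and only build the message text so that they can be
exercised; the real fatal() and fatalx() log the message and terminate the
process, and their exact output format is not shown in this log.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * fatal()-style text: consult errno directly instead of taking the
 * error number as a parameter.
 */
int
fatal_text(char *buf, size_t len, const char *emsg)
{
	if (errno != 0)
		return snprintf(buf, len, "fatal: %s: %s", emsg,
		    strerror(errno));
	return snprintf(buf, len, "fatal: %s", emsg);
}

/* fatalx()-style text: ignore errno, so strerror() is never used. */
int
fatalx_text(char *buf, size_t len, const char *emsg)
{
	errno = 0;
	return fatal_text(buf, len, emsg);
}
```

A caller that obtained an error number some other way, for example from a
socket, would assign it to errno before calling the fatal()-style function.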


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.263 09-Apr-2024 claudio

Check that the ASPA tas array fits in an IMSG before sending the ASPA
record over to RTR or the RDE.

The long term goal is to increase the IMSG size considerably but that
requires some additional API changes to the imsg API.
OK tb@


Revision tags: OPENBSD_7_5_BASE
# 1.262 09-Jan-2024 claudio

Convert the parent process imsg handling over to the new imsg API.

This simplifies the code a fair bit and removes direct unchecked memory
access to imsg.data.
OK tb@


# 1.261 04-Jan-2024 claudio

Rename argument roa of imsg_send_sockets() to rtr since the imsgbuf is
for PROC_RTR.


# 1.260 07-Nov-2023 claudio

Rename struct imsgbuf *ibuf to *imsgbuf in all places.
ibuf should be reserved for struct ibuf * values.
OK tb@


Revision tags: OPENBSD_7_4_BASE
# 1.259 16-Aug-2023 claudio

Remove per-AFI ASPA handling in bgpd internals

With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication form the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statememnts were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes, like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
The was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exists in the
interfaces rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exists there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In anycase do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
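
As a rough sketch of the idea: READ_SIZE below is an assumed value and
make_ipc_pair() a hypothetical name, neither taken from bgpd.c. Treating
a refused buffer size as non-fatal is also a choice made for this
example.

```c
#include <sys/types.h>
#include <sys/socket.h>

#define READ_SIZE	65535	/* assumed read size, illustrative only */

/*
 * Create a socketpair and ask for send and receive buffers of four
 * times the read size on both ends. Returns 0 on success, -1 if the
 * pair could not be created.
 */
int
make_ipc_pair(int sv[2])
{
	int i, bsize = 4 * READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		/* Best effort: a refused size leaves the default. */
		(void)setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize));
		(void)setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize));
	}
	return 0;
}
```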


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling, moving the pfkey socket to the
parent process along the way. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
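
A sketch of the guard, assuming a hypothetical helper name
dup_for_exec(): dup2() clears close-on-exec on the new descriptor, but
when both descriptors are equal it does nothing, so the flag has to be
cleared explicitly.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd for a later exec().
 * Returns 0 on success, -1 on error.
 */
int
dup_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would be a no-op and leave FD_CLOEXEC set. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	return dup2(oldd, newd) == -1 ? -1 : 0;
}
```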


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
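
The lookup idea can be illustrated with a sorted array and bsearch(3).
This is a minimal sketch; the function names as_set_prep() and
as_set_match() are assumptions for the example and not the bgpd API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once when the set is built ... */
void
as_set_prep(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* ... then each lookup is O(log n). Returns 1 if as is in the set. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Compared with walking a long list of AS numbers per filter rule, the
cost of a match no longer grows linearly with the size of the set.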


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
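
On the child side, the "pipe teardown" amounts to treating a read of
zero bytes as the signal to exit. A minimal sketch, with the
hypothetical helper drain_until_eof() standing in for the real event
loop:

```c
#include <sys/types.h>
#include <errno.h>
#include <unistd.h>

/*
 * Consume the IPC channel until the parent closes its end.
 * Returns the total number of bytes read on EOF, -1 on error.
 */
long
drain_until_eof(int fd)
{
	char buf[512];
	long total = 0;
	ssize_t n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return total;	/* peer closed: time to exit */
		total += n;
	}
}
```

The parent only has to close() its descriptors; no kill(2) and no pid
bookkeeping is involved.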


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
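
The ordering matters because a signal can arrive while the work is
running. A minimal sketch of the corrected pattern; the names
reconfig_pending, do_reload() and check_signals() are hypothetical and
chosen only for this example.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set by the signal handler */
int reloads_done;			/* stands in for real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reloads_done++;
}

void
check_signals(void)
{
	if (reconfig_pending) {
		/*
		 * Clear first: a signal arriving during do_reload()
		 * sets the flag again and is handled on the next pass.
		 * Clearing afterwards would silently lose it.
		 */
		reconfig_pending = 0;
		do_reload();
	}
}
```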


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.262 09-Jan-2024 claudio

Convert the parent process imsg handling over to the new imsg API.

This simplifies the code a fair bit and removes direct unchecked memory
access to imsg.data.
OK tb@


# 1.261 04-Jan-2024 claudio

Rename argument roa of imsg_send_sockets() to rtr since the imsgbuf is
for PROC_RTR.


# 1.260 07-Nov-2023 claudio

Rename struct imsgbuf *ibuf to *imsgbuf in all places.
ibuf should be reserved for struct ibuf * values.
OK tb@


Revision tags: OPENBSD_7_4_BASE
# 1.259 16-Aug-2023 claudio

Remove per-AFI ASPA handling in bgpd internals

With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication from the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interfaces rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
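
The poll() error handling described in this and the following entry can be sketched roughly as below. This is an illustration under assumptions, not the bgpd.c main loop; hup_seen, reload_count, dispatched and loop_once are hypothetical names.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/* Hypothetical state for illustration only. */
volatile sig_atomic_t hup_seen;	/* would be set by a SIGHUP handler */
int reload_count;		/* counts processed reload requests */
int dispatched;			/* counts ready descriptors handled */

/*
 * One pass of a poll loop. Returns -1 on a hard poll() error, 0 otherwise.
 * After EINTR the pollfd array is not inspected, but the signal flags
 * are still checked so that pending work is picked up right away.
 */
int
loop_once(struct pollfd *pfd, nfds_t n, int timeout)
{
	int nready, pfd_valid = 1;

	nready = poll(pfd, n, timeout);
	if (nready == -1) {
		if (errno != EINTR)
			return -1;
		pfd_valid = 0;	/* interrupted: revents are not usable */
	}

	if (pfd_valid && nready > 0)
		dispatched += nready;	/* real code would walk pfd[i].revents */

	/* runs on every pass, including after EINTR */
	if (hup_seen) {
		hup_seen = 0;
		reload_count++;
	}
	return 0;
}
```

The point of the split is that an interrupted poll() still reaches the signal checks at the end of the pass instead of jumping back to the top of the loop.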


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
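
A small sketch of the pattern this commit avoids, assuming a helper that (like addr2sa) returns the sockaddr and stores its length through a pointer. make_loopback_sa and connect_loopback are hypothetical stand-ins, not bgpd functions.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills in *ss, stores the
 * sockaddr length in *len and returns a pointer to the sockaddr.
 */
struct sockaddr *
make_loopback_sa(struct sockaddr_storage *ss, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(179);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

int
connect_loopback(int fd)
{
	struct sockaddr_storage ss;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Problematic form:
	 *	connect(fd, make_loopback_sa(&ss, &len), len);
	 * C does not specify the order in which arguments are evaluated,
	 * so `len` may be read before the helper has stored into it.
	 */
	sa = make_loopback_sa(&ss, &len);	/* len is set here ... */
	return connect(fd, sa, len);		/* ... and only read here */
}
```

Splitting the call into two statements puts a sequence point between the write to len and its use.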


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
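
The pitfall can be sketched like this. It is one possible way to handle the equal-descriptor case, not necessarily what the commit does; move_fd_for_exec is a hypothetical helper.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make `fd` available as descriptor `target`
 * in a form that survives exec(). Returns 0 on success, -1 on error.
 */
int
move_fd_for_exec(int fd, int target)
{
	int flags;

	if (fd == target) {
		/*
		 * dup2(fd, fd) does nothing, so a close-on-exec flag
		 * on fd would stay set. Clear it explicitly instead.
		 */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}

	/* the duplicate created by dup2() does not have FD_CLOEXEC set */
	if (dup2(fd, target) == -1)
		return -1;
	return 0;
}
```

A caller would use it just before exec(), for example move_fd_for_exec(pipe_fd, 3) when the child expects its imsg pipe on descriptor 3 (the number 3 is only an example here).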


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK; there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
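
As a rough illustration of the lookup idea, here is a minimal sketch of a bsearch(3) membership test over a sorted array of AS numbers. The names as_cmp() and as_set_match() are assumptions made for this example; the actual bgpd data structures are not shown in this log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers for bsearch(3). */
int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	if (x < y)
		return -1;
	return x > y;
}

/*
 * Hypothetical lookup: returns 1 if as is contained in the sorted
 * array set[0..n-1], 0 otherwise. The array must be sorted ascending.
 */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

The lookup cost grows logarithmically with the set size, which is the point of replacing long linear lists of AS numbers.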


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
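
The teardown described above can be sketched as follows. This is a simplified stand-alone demonstration, not bgpd code: pipe_teardown_demo() is a hypothetical name, and a plain read(2) loop stands in for the child's event loop.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that blocks reading from its end of a socketpair and
 * exits once read(2) reports EOF. The parent only closes its end and
 * never calls kill(2). Returns the child's exit status, or -1.
 */
int
pipe_teardown_demo(void)
{
	int sv[2], status;
	pid_t pid;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		/* stand-in for the event loop: leave on EOF or error */
		while (read(sv[1], &c, 1) > 0)
			;
		_exit(0);
	}
	close(sv[1]);
	close(sv[0]);	/* the "shutdown": the child now sees EOF */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent never sends a signal, it needs no stored PID for the shutdown itself, which is where the PID reuse race mentioned above goes away.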


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
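
A minimal sketch of the flag-based approach, assuming a system that provides SOCK_NONBLOCK and SOCK_CLOEXEC (OpenBSD and Linux do). The function name open_nb_socket() is made up for this example; accept4() takes the same two flags for accepted connections.

```c
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_UNIX stream socket that is
 * non-blocking and close-on-exec from the start, so no separate
 * fcntl(2) calls are needed afterwards. Returns the fd or -1.
 */
int
open_nb_socket(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}
```

Setting both properties atomically at creation also avoids the window in which a freshly created descriptor could leak across an exec.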


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
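
To illustrate how a broken pipe is still detected with the signal ignored, here is a minimal stand-alone sketch. The function broken_pipe_errno() is a hypothetical name for this example; it relies only on the standard behaviour that write(2) fails with EPIPE when SIGPIPE is ignored.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE, then write to a pipe whose read end is closed.
 * Returns the errno reported by write(2), 0 if the write
 * unexpectedly succeeded, or -1 on setup failure.
 */
int
broken_pipe_errno(void)
{
	int fds[2];
	int e = 0;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);			/* no reader left */
	if (write(fds[1], "x", 1) == -1)
		e = errno;		/* error return instead of a signal */
	close(fds[1]);
	return e;
}
```

With the default disposition the process would be terminated by the signal before it could inspect the error.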


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
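
The fix described above can be sketched like this. It is a simplified illustration with a hypothetical poll_once() helper and a fixed array of two descriptors, and it uses memset(3) where the commit mentions bzero.

```c
#include <poll.h>
#include <string.h>

#define NFDS 2

/*
 * Poll two descriptors and return poll(2)'s result. The array is
 * cleared before every call so that stale revents from an earlier
 * iteration can never be mistaken for fresh events when poll()
 * fails (for example with EINTR) and leaves the array untouched.
 */
int
poll_once(int fd_a, int fd_b, struct pollfd *pfd, int timeout_ms)
{
	memset(pfd, 0, NFDS * sizeof(pfd[0]));
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;
	return poll(pfd, NFDS, timeout_ms);
}
```

A caller should still check the return value first and only look at revents when it is greater than zero.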


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
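
The ordering fix can be sketched as below. All names (reconfig_pending, sighup_handler, do_reload, check_signals) are hypothetical stand-ins, not the identifiers used in bgpd.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set by the signal handler */
int reloads;				/* counts reloads, for illustration */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reloads++;
}

/* Called from the main loop after poll() returns. */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;	/* clear the flag first ... */
		do_reload();		/* ... then do the work */
	}
}
```

If a second signal arrives while do_reload() runs, the flag is set again and the next loop iteration picks it up. With the old order the flag would be cleared after the work and that second request would be lost.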


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
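
A minimal sketch of the check described above: after SIGCHLD, call waitpid(2) and inspect the status instead of assuming the child died. The function reap_child() and its return convention are assumptions for this example, not the code from bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: reap one child. options is passed to
 * waitpid(2), e.g. WNOHANG from a main loop.
 * Returns 1 and stores the pid in *gone if a child exited or was
 * killed by a signal, 0 if nothing of that kind happened, -1 on error.
 */
int
reap_child(int options, pid_t *gone)
{
	pid_t pid;
	int status;

	do {
		pid = waitpid(-1, &status, options);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return errno == ECHILD ? 0 : -1;
	if (pid == 0)
		return 0;	/* WNOHANG and no state change pending */
	if (WIFEXITED(status) || WIFSIGNALED(status)) {
		*gone = pid;
		return 1;	/* the child really went away */
	}
	return 0;		/* e.g. stopped or continued */
}
```

Only the return value 1 would justify logging the loss of a child and shutting down.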


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
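
The split between reading and message extraction can be sketched as follows. This is not the imsg API: struct rbuf, rbuf_read() and rbuf_get() are hypothetical, and the framing (a one-byte payload length followed by the payload) is an assumption chosen to keep the example short.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RBUF_SIZE 4096

struct rbuf {
	uint8_t	buf[RBUF_SIZE];
	size_t	len;		/* bytes currently buffered */
};

/* Read whatever is available into the buffer; returns read(2)'s result. */
ssize_t
rbuf_read(struct rbuf *rb, int fd)
{
	ssize_t n;

	n = read(fd, rb->buf + rb->len, sizeof(rb->buf) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return n;
}

/*
 * Extract one complete message from the buffer without touching the fd.
 * out must hold at least 255 bytes. Returns 1 if a message was copied
 * to out (length in *outlen), 0 if no complete message is buffered yet.
 */
int
rbuf_get(struct rbuf *rb, uint8_t *out, size_t *outlen)
{
	size_t plen;

	if (rb->len < 1)
		return 0;
	plen = rb->buf[0];
	if (rb->len < 1 + plen)
		return 0;
	memcpy(out, rb->buf + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->buf, rb->buf + 1 + plen, rb->len);
	return 1;
}
```

A caller would invoke rbuf_read() once per poll event and then call rbuf_get() in a loop until it returns 0, so that several messages delivered by one read are all processed right away.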


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries no longer
prevents bad routes from hitting the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes from being installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@
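
The EINTR rule from 1.240 can be sketched roughly as below. This is only an
illustration under stated assumptions: example_poll_step() and
example_poll_demo() are hypothetical names and are not taken from bgpd.c. The
point is that after a failed poll() the pollfd array is not inspected, and
that EINTR is reported as "nothing to do" instead of as a hard error.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper, not bgpd's code. Returns the number of ready
 * descriptors, 0 if nothing is ready or poll() was interrupted (the
 * caller must then not look at revents), or -1 on a hard error.
 */
static int
example_poll_step(struct pollfd *pfd, nfds_t npfd, int timeout)
{
	int n;

	assert(pfd != NULL);
	n = poll(pfd, npfd, timeout);
	if (n == -1)
		return errno == EINTR ? 0 : -1;
	return n;
}

/* Demo: one readable pipe end should be reported as ready. */
static int
example_poll_demo(void)
{
	struct pollfd pfd;
	int p[2], n;

	if (pipe(p) == -1)
		return -1;
	if (write(p[1], "x", 1) != 1)
		return -1;
	pfd.fd = p[0];
	pfd.events = POLLIN;
	pfd.revents = 0;
	n = example_poll_step(&pfd, 1, 1000);
	close(p[0]);
	close(p[1]);
	return n == 1 && (pfd.revents & POLLIN) != 0;
}
```

In a real event loop the caller would presumably still run its signal flag
checks after a 0 return, which is what 1.241 describes.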


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
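
A minimal sketch of the argument-order problem described in 1.235, with
stand-in functions: example_addr2sa() and example_use() are hypothetical and
only mimic the shape of addr2sa() and connect()/bind(). C does not fix the
order in which function arguments are evaluated, so a length that is written
by one argument expression and read by another may be read too early.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical stand-in for addr2sa(): fills *len as a side effect. */
static struct sockaddr *
example_addr2sa(struct sockaddr_in *sin, socklen_t *len)
{
	assert(sin != NULL && len != NULL);
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	*len = sizeof(*sin);
	return (struct sockaddr *)sin;
}

/* Stand-in for connect()/bind(): just reports the length it was given. */
static socklen_t
example_use(const struct sockaddr *sa, socklen_t len)
{
	(void)sa;
	return len;
}

static socklen_t
example_safe_call(void)
{
	struct sockaddr_in sin;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Risky form: example_use(example_addr2sa(&sin, &len), len);
	 * the second argument may be evaluated before len is written.
	 */
	sa = example_addr2sa(&sin, &len);	/* len is written here ... */
	return example_use(sa, len);		/* ... and read only afterwards */
}
```

Splitting the call into two statements puts a sequence point between the
write and the read of len, so every compiler passes the filled-in length.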


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config(), which sends the config to the SE and RDE,
out of reconfigure().
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
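
The idiom from 1.217 might look like this in isolation. struct example_peer
and example_compose() are made up for illustration and are not the imsg API;
the sketch only shows that sizeof(*obj) follows the pointer's type
automatically, whereas a spelled-out struct name can silently go stale when
the variable's type changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload type. */
struct example_peer {
	uint32_t	id;
	char		descr[32];
};

/* Hypothetical "compose" call: copies len bytes, returns bytes copied. */
static size_t
example_compose(void *buf, size_t buflen, const void *data, size_t len)
{
	assert(buf != NULL && data != NULL);
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

static size_t
example_sizeof_demo(void)
{
	struct example_peer peer = { 7, "upstream" }, *p = &peer;
	unsigned char buf[64];

	/* sizeof(*p) stays correct even if the type of p is changed later. */
	return example_compose(buf, sizeof(buf), p, sizeof(*p));
}
```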


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will then be closed unexpectedly by
exec().

ok tedu florian
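
One way to handle the oldd == newd case from 1.214 is sketched below. This is
an assumption about the general technique, not a copy of the committed code,
and example_move_fd() and example_move_fd_demo() are hypothetical names. A
descriptor created by dup2() never carries FD_CLOEXEC, but when both numbers
are equal dup2() does nothing, so the flag has to be cleared explicitly.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make oldd available as newd in a way that
 * survives exec(). Returns 0 on success, -1 on error.
 */
static int
example_move_fd(int oldd, int newd)
{
	int flags;

	assert(oldd >= 0 && newd >= 0);
	if (oldd == newd) {
		/* dup2() would be a no-op; clear close-on-exec by hand. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		return fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0;
	}
	/* The duplicate made by dup2() does not inherit FD_CLOEXEC. */
	return dup2(oldd, newd) == -1 ? -1 : 0;
}

/* Demo: returns 1 if FD_CLOEXEC is gone after a same-fd "move". */
static int
example_move_fd_demo(void)
{
	int p[2], flags, ok;

	if (pipe(p) == -1)
		return -1;
	if (fcntl(p[0], F_SETFD, FD_CLOEXEC) == -1)
		return -1;
	ok = example_move_fd(p[0], p[0]) == 0 &&
	    (flags = fcntl(p[0], F_GETFD)) != -1 &&
	    (flags & FD_CLOEXEC) == 0;
	close(p[0]);
	close(p[1]);
	return ok;
}
```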


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
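
The bsearch-based membership test mentioned in 1.195 can be illustrated with
a plain sorted array. This is a simplified sketch: the example_* names are
hypothetical and the real as-set / set_table code is not shown here. The only
real library calls used are qsort(3) and bsearch(3).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
example_as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once so that later lookups can use binary search. */
static void
example_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL);
	qsort(set, n, sizeof(*set), example_as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
static int
example_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), example_as_cmp) != NULL;
}

static int
example_set_demo(void)
{
	uint32_t set[] = { 65003, 3333, 15562 };
	size_t n = sizeof(set) / sizeof(set[0]);

	example_set_prepare(set, n);
	return example_set_match(set, n, 3333) &&
	    !example_set_match(set, n, 64512);
}
```

Each lookup costs O(log n) comparisons, compared with a linear walk over a
long list of AS numbers in a filter rule.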


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
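
A minimal sketch of what ignoring SIGPIPE buys, not taken from bgpd.c: with
the signal ignored, a write to a pipe whose reader is gone fails with EPIPE
instead of killing the process. The function name write_to_closed_pipe() is
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write to a pipe with no reader while SIGPIPE is ignored.
 * Returns the errno reported by write(2), or 0 if the write succeeded.
 */
static int
write_to_closed_pipe(void)
{
	int p[2], rc, saved;
	ssize_t n;

	signal(SIGPIPE, SIG_IGN);	/* broken pipes now show up as EPIPE */
	rc = pipe(p);
	assert(rc == 0);
	close(p[0]);			/* no reader left */
	n = write(p[1], "x", 1);
	saved = errno;
	close(p[1]);
	return (n == -1 ? saved : 0);
}
```

The caller can then treat EPIPE like any other write error and clean up.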


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
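
The fix described above can be sketched roughly as follows. This is an
illustration under assumptions, not the bgpd.c poll loop: poll_once(),
demo_poll() and PFD_COUNT are hypothetical names, and memset() stands in for
bzero().

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define PFD_COUNT 2

/*
 * Returns the number of ready descriptors, 0 on timeout or EINTR, and
 * -1 on other errors.  ready[i] is set to 1 when fds[i] is readable.
 */
static int
poll_once(const int fds[PFD_COUNT], int ready[PFD_COUNT], int timeout_ms)
{
	struct pollfd pfd[PFD_COUNT];
	int i, n;

	/* Start from a clean array so no stale revents survive an error. */
	memset(pfd, 0, sizeof(pfd));
	for (i = 0; i < PFD_COUNT; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
		ready[i] = 0;
	}

	n = poll(pfd, PFD_COUNT, timeout_ms);
	if (n == -1)
		return (errno == EINTR ? 0 : -1);
	/* Only look at revents after poll() reported success. */
	for (i = 0; i < PFD_COUNT; i++)
		if (pfd[i].revents & POLLIN)
			ready[i] = 1;
	return n;
}

/* Demo: two pipes, only the first one has data queued. */
static int
demo_poll(void)
{
	int a[2], b[2], fds[PFD_COUNT], ready[PFD_COUNT], rc;
	ssize_t w;

	rc = pipe(a);
	assert(rc == 0);
	rc = pipe(b);
	assert(rc == 0);
	w = write(a[1], "x", 1);
	assert(w == 1);
	fds[0] = a[0];
	fds[1] = b[0];
	if (poll_once(fds, ready, 0) != 1)
		return -1;
	return (ready[0] == 1 && ready[1] == 0);
}
```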


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
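
A small sketch of the ordering this commit describes, assuming a SIGHUP
driven reload. The names reconfig_pending, do_reload() and check_signals()
are hypothetical and not taken from bgpd.c.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t reconfig_pending;
static int reloads_done;

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Hypothetical stand-in for the real work (e.g. rereading a config). */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work.  A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
static void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}

/* Demo: one signal leads to exactly one reload. */
static int
demo_reload_once(void)
{
	signal(SIGHUP, sighup_handler);
	raise(SIGHUP);		/* handler has run when raise() returns */
	assert(reconfig_pending == 1);
	check_signals();
	check_signals();	/* flag already cleared, no second reload */
	return reloads_done;
}
```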


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
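
For readers unfamiliar with descriptor passing, here is a generic SCM_RIGHTS
sketch. It is not the imsg/msgbuf implementation: send_fd(), recv_fd() and
demo_fd_pass() are hypothetical helpers, and the demo passes a pipe end over
a socketpair rather than a bound listening socket.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Fill in a msghdr carrying one dummy data byte plus the control buffer. */
static void
fd_msg_init(struct msghdr *msg, struct iovec *iov, char *byte,
    union fd_cmsg *cm)
{
	memset(msg, 0, sizeof(*msg));
	memset(cm, 0, sizeof(*cm));
	iov->iov_base = byte;
	iov->iov_len = 1;
	msg->msg_iov = iov;
	msg->msg_iovlen = 1;
	msg->msg_control = cm->buf;
	msg->msg_controllen = sizeof(cm->buf);
}

/* Send descriptor fd over the Unix-domain socket sock. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cm;
	char byte = 0;

	fd_msg_init(&msg, &iov, &byte, &cm);
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from sock; returns it, or -1 on failure. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cm;
	char byte;
	int fd;

	fd_msg_init(&msg, &iov, &byte, &cm);
	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}

/*
 * Demo: pass the write end of a pipe across a socketpair and use it.
 * Descriptors are left open for brevity.
 */
static int
demo_fd_pass(void)
{
	int sv[2], p[2], rc, newfd;
	char c = 0;

	rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
	assert(rc == 0);
	rc = pipe(p);
	assert(rc == 0);
	if (send_fd(sv[0], p[1]) == -1 || (newfd = recv_fd(sv[1])) == -1)
		return -1;
	if (write(newfd, "x", 1) != 1 || read(p[0], &c, 1) != 1)
		return -1;
	return c;	/* the byte written through the received descriptor */
}
```

In the setup the commit describes, the privileged side would do socket() and
bind() and hand the resulting descriptor to the unprivileged process in the
same way.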


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
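
The check described above might look roughly like this. It is a sketch, not
the bgpd.c code: child_is_gone() and demo_reap() are hypothetical names. A
SIGCHLD only says that some child changed state, so the status from waitpid()
decides whether it actually exited or was killed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Returns 1 if the child is really gone (exited or killed by a signal),
 * 0 if it is still alive (e.g. it was only stopped, or WNOHANG found no
 * state change), and -1 on waitpid() error.
 */
static int
child_is_gone(pid_t pid, int options)
{
	int status;
	pid_t r;	/* waitpid() returns a pid_t, not an int */

	r = waitpid(pid, &status, options);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;
	return 0;
}

/* Demo: fork a child that exits right away and reap it. */
static int
demo_reap(void)
{
	pid_t pid = fork();

	assert(pid != -1);
	if (pid == 0)
		_exit(3);	/* child terminates immediately */
	return child_is_gone(pid, 0);	/* blocks until the child exits */
}
```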


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
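
The read-once, drain-in-a-loop pattern can be sketched as below. This does
not use the real imsg API: msg_read(), msg_get(), struct rbuf and the 1-byte
length framing are all hypothetical, chosen only to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical framing: 1-byte length header followed by the payload. */
struct rbuf {
	uint8_t	data[256];
	size_t	len;
};

/* One read(2) per poll event; may pull in several messages at once. */
static ssize_t
msg_read(int fd, struct rbuf *rb)
{
	ssize_t n;

	n = read(fd, rb->data + rb->len, sizeof(rb->data) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return n;	/* 0 means the peer closed the connection */
}

/*
 * Extract one complete message from the buffer without touching the fd.
 * Returns 1 and fills out/outlen if a message was available, else 0.
 * out must have room for 255 bytes.
 */
static int
msg_get(struct rbuf *rb, uint8_t *out, size_t *outlen)
{
	size_t plen;

	if (rb->len < 1)
		return 0;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return 0;	/* message not complete yet */
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return 1;
}

/* Demo: two messages arrive in one read and both get processed. */
static int
demo_drain(void)
{
	static const uint8_t wire[] = { 2, 'h', 'i', 3, 'b', 'y', 'e' };
	struct rbuf rb = { .len = 0 };
	uint8_t payload[255];
	size_t plen;
	ssize_t w;
	int p[2], rc, count = 0;

	rc = pipe(p);
	assert(rc == 0);
	w = write(p[1], wire, sizeof(wire));
	assert(w == (ssize_t)sizeof(wire));
	if (msg_read(p[0], &rb) <= 0)
		return -1;
	/* Loop until the buffer holds no further complete message. */
	while (msg_get(&rb, payload, &plen))
		count++;
	close(p[0]);
	close(p[1]);
	return count;
}
```

Calling msg_get() only once per poll event would leave the second message
sitting in the buffer, which is the queue build-up the commit describes.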


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for a ackmessage from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.260 07-Nov-2023 claudio

Rename struct imsgbuf *ibuf to *imsgbuf in all places.
ibuf should be reserved for struct ibuf * values.
OK tb@


Revision tags: OPENBSD_7_4_BASE
# 1.259 16-Aug-2023 claudio

Remove per-AFI ASPA handling in bgpd internals

With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication from the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries no longer
prevents bad routes from hitting the FIB. Also loopback IPs are checked in a
few other places to prevent bad routes from being installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related actions but the signal
delivery checks at the end still need to happen, so that on EINTR bgpd
processes reconfigure requests or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
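
A sketch of the pattern, assuming a simplified helper: demo_addr2sa() is a
hypothetical stand-in with an invented signature and is not bgpd's addr2sa().
The point is only that the helper runs as its own statement before len is
read.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for an addr2sa()-style helper: fills a static
 * sockaddr and reports its size through *len.
 */
struct sockaddr *
demo_addr2sa(in_addr_t addr, in_port_t port, socklen_t *len)
{
	static struct sockaddr_in	sin;

	assert(len != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = addr;	/* already in network byte order */
	sin.sin_port = htons(port);
	*len = sizeof(sin);
	return ((struct sockaddr *)&sin);
}

int
demo_bind(int fd, in_addr_t addr, in_port_t port)
{
	struct sockaddr	*sa;
	socklen_t	 len = 0;

	/*
	 * Not: bind(fd, demo_addr2sa(addr, port, &len), len);
	 * the order in which the arguments are evaluated is
	 * unspecified, so len could still be 0 when it is read.
	 */
	sa = demo_addr2sa(addr, port, &len);
	return (bind(fd, sa, len));
}
```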


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
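
To illustrate the convention, here is a small sketch. demo_send() is a
hypothetical stand-in for an imsg-style call that takes a pointer and a
length; it is not part of the imsg API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object type for this illustration. */
struct demo_peer {
	unsigned int	id;
	char		descr[32];
};

/* Hypothetical stand-in for a "send (data, len)" style call. */
size_t
demo_send(void *buf, size_t buflen, const void *data, size_t len)
{
	assert(buf != NULL && data != NULL);
	if (len > buflen)
		return (0);
	memcpy(buf, data, len);
	return (len);
}

size_t
demo_send_peer(void *buf, size_t buflen, const struct demo_peer *p)
{
	/* sizeof(*p) follows the type of p if that ever changes */
	return (demo_send(buf, buflen, p, sizeof(*p)));
}
```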


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The keys are refreshed whenever the session state
changes to IDLE or ACTIVE. This should behave better when reloading configs
with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
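
One way to handle both cases is sketched below. This is an assumption about
the general technique, not a copy of the committed fix; demo_move_fd() is a
hypothetical name.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make fd available as target for a program about to be exec'd.
 * dup2() clears FD_CLOEXEC on the new descriptor, but when
 * fd == target it changes nothing, so the flag is cleared by hand.
 * Returns 0 on success, -1 on error.
 */
int
demo_move_fd(int fd, int target)
{
	int	flags;

	assert(fd >= 0 && target >= 0);
	if (fd == target) {
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	if (dup2(fd, target) == -1)
		return (-1);
	return (0);
}
```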


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
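
The lookup idea can be sketched with the standard qsort() and bsearch()
calls. The function names and the flat uint32_t array are assumptions made
for this example and do not reflect bgpd's actual as-set data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Order two AS numbers for qsort() and bsearch(). */
static int
demo_as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort the table once so that lookups can use binary search. */
void
demo_as_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL && n > 0);
	qsort(set, n, sizeof(*set), demo_as_cmp);
}

/* Return 1 if as is a member of the sorted table, 0 otherwise. */
int
demo_as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	assert(set != NULL);
	return (bsearch(&as, set, n, sizeof(*set), demo_as_cmp) != NULL);
}
```

Each lookup costs O(log n) comparisons, where walking a list of AS numbers
would cost O(n).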


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
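
The mechanism can be sketched in a single process, assuming a plain read()
loop in place of the daemon's event loop. Both function names are
hypothetical and this is not the bgpd code.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Child side: read from the IPC socket until the peer closes it.
 * read() returning 0 is the EOF that ends the loop.
 * Returns the number of bytes seen, or -1 on a read error.
 */
ssize_t
demo_child_loop(int fd)
{
	char	buf[64];
	ssize_t	n, total = 0;

	assert(fd >= 0);
	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			break;		/* parent closed its end */
		total += n;
	}
	return (total);
}

/*
 * Demonstration: queue one message, close the parent end and let the
 * child loop run into EOF. Returns what the loop saw, -1 on error.
 */
ssize_t
demo_teardown(void)
{
	int	sp[2];
	ssize_t	n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) == -1)
		return (-1);
	if (write(sp[0], "bye", 3) != 3) {
		close(sp[0]);
		close(sp[1]);
		return (-1);
	}
	close(sp[0]);		/* closing is the whole shutdown signal */
	n = demo_child_loop(sp[1]);
	close(sp[1]);
	return (n);
}
```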


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at
runtime. Feature request by a few people who often forgot to add -r path when
restarting bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
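
As a sketch of why the signal is not needed: with SIGPIPE ignored, a write to
a broken pipe fails with EPIPE and can be handled like any other error. The
function name is hypothetical and the one-byte write is only for
demonstration.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE and write one byte to fd. A broken pipe then shows
 * up as write() failing with EPIPE instead of killing the process.
 * Returns 1 if the pipe is broken, 0 if the write worked, -1 on any
 * other error.
 */
int
demo_write_detect_epipe(int fd)
{
	assert(fd >= 0);
	signal(SIGPIPE, SIG_IGN);
	if (write(fd, "x", 1) == -1)
		return (errno == EPIPE ? 1 : -1);
	return (0);
}
```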


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters,
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
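
A minimal sketch of the ordering described in this entry, not the actual
bgpd.c code. The names reconfig_pending, reload_config() and
handle_pending_signals() are hypothetical. Because the flag is cleared
before the work runs, a signal that arrives during the work sets the
flag again and is handled on the next pass instead of being lost.

```c
#include <signal.h>

/* Hypothetical names; only the ordering matters. */
static volatile sig_atomic_t reconfig_pending;
static int reload_count;

/* Signal handler: only records that a request arrived. */
static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work (e.g. rereading the config file). */
static void
reload_config(void)
{
	reload_count++;
}

/*
 * Called from the main loop. The flag is cleared BEFORE the work runs.
 * Returns 1 if work was done, 0 otherwise.
 */
static int
handle_pending_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		reload_config();
		return 1;
	}
	return 0;
}
```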


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
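
A minimal sketch of the check described in this entry, not the actual
bgpd.c code. The helper name child_is_gone() is hypothetical. SIGCHLD
only says that some child changed state, so the status returned by
waitpid() decides whether the child really went away.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Returns 1 if the child identified by pid exited or was killed by a
 * signal, 0 if waitpid() has no termination to report for it, and -1
 * on waitpid() failure.
 */
static int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* child still running */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;	/* caller would log and shut down here */
	return 0;
}
```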


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
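
A minimal sketch of the read-once, get-in-a-loop pattern described in
this entry. It does not use the real imsg API: struct rbuf, msg_get()
and msg_drain() are hypothetical, and the framing (a 2-byte length in
host byte order that counts the header itself) is an assumption made
for illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical read buffer filled by a separate read(2) step. */
struct rbuf {
	unsigned char	data[4096];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer into out.
 * Returns the payload length, -1 if no complete message is buffered
 * yet, or -2 if the header is malformed or the payload exceeds outsz.
 */
static int
msg_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	uint16_t total;

	if (rb->len < sizeof(total))
		return -1;
	memcpy(&total, rb->data, sizeof(total));
	if (total < sizeof(total) || total - sizeof(total) > outsz)
		return -2;
	if (total > rb->len)
		return -1;	/* incomplete, wait for more data */

	memcpy(out, rb->data + sizeof(total), total - sizeof(total));
	/* Shift any following messages to the front of the buffer. */
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (int)(total - sizeof(total));
}

/* Process every buffered message; returns how many were handled. */
static int
msg_drain(struct rbuf *rb)
{
	unsigned char payload[4096];
	int n = 0;

	while (msg_get(rb, payload, sizeof(payload)) >= 0)
		n++;	/* a real caller would dispatch on the payload */
	return n;
}
```

Calling the get step only once per poll event would leave the second
message sitting in the buffer until the next event, which is the queue
build-up the entry describes.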


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.259 16-Aug-2023 claudio

Remove per-AFI ASPA handling in bgpd internals

With draft-ietf-sidrops-aspa-profile-16 and
draft-ietf-sidrops-aspa-verification-15 the AFI dependence of ASPA
records was dropped. So remove this complication from the code.

This only removes the AFI handling internally in bgpd but still allows
the old syntax in aspa-set tables. The optional address family is just
ignored and records are merged together.

For RTR sessions draft-ietf-sidrops-8210bis has not yet been updated so
right now we still handle RTR sessions as specified there. The IPv4 and
IPv6 ASPA entries are handled in two trees and merged together into one
AFI independent tree. This is the best we can do for now until IETF
updates draft-ietf-sidrops-8210bis.

OK tb@ job@


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
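
A minimal sketch of the pitfall described in this entry, not the actual
bgpd code. make_sa() is a hypothetical stand-in for an addr2sa()-style
helper that returns a sockaddr pointer and stores its size through a
pointer argument. C does not specify the order in which function
arguments are evaluated, so the helper is called in its own statement
before the length is passed on.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fills a static IPv4 loopback sockaddr for the
 * given port and stores its size in *len.
 */
static struct sockaddr *
make_sa(uint16_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

static int
bind_loopback(int fd, uint16_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Not: bind(fd, make_sa(port, &len), len);
	 * There len could be read before make_sa() has stored the size.
	 */
	sa = make_sa(port, &len);
	return bind(fd, sa, len);
}
```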


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
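
A minimal sketch of the idea in this entry, not the actual bgpd code.
The helper name make_socketpair() and the SKETCH_READ_SIZE value are
assumptions for illustration; the entry only states that the buffers
are set to 4 times the read size. Treating a setsockopt() failure as
fatal is a choice of this sketch.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define SKETCH_READ_SIZE	65535	/* assumed read size, illustration only */

/*
 * Create a socketpair whose send and receive buffers are sized to
 * 4 times the read size. Returns 0 on success, -1 on error.
 */
static int
make_socketpair(int sv[2])
{
	int i, bsize = 4 * SKETCH_READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```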


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
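
A minimal sketch of the case described in this entry, not the actual
bgpd code. The helper name fd_for_exec() is hypothetical. dup2() clears
the close-on-exec flag on the new descriptor, but when both descriptors
are equal it does nothing, so the flag has to be cleared by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make newd refer to the same open file as oldd, with close-on-exec
 * cleared on newd so it survives exec(). Returns 0 on success, -1 on
 * error.
 */
static int
fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd != newd)
		return dup2(oldd, newd) == -1 ? -1 : 0;

	/* dup2(fd, fd) is a no-op: clear FD_CLOEXEC explicitly. */
	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return -1;
	if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
		return -1;
	return 0;
}
```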


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@
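
One way to picture the split of responsibilities, as a sketch under stated
assumptions: the timer side reports -1 for "nothing pending" and the poll
loop applies its own cap. poll_timeout_ms() and MAX_TIMEOUT are made-up
names, and the deadline-based interface is an assumption, not how
mrt_timeout() is actually declared.

```c
#include <time.h>

#define MAX_TIMEOUT	3600	/* assumed cap in seconds, for illustration */

/*
 * Hypothetical helper: turn an optional deadline into a poll(2)
 * timeout in milliseconds. next == -1 means "no timer pending";
 * the cap is applied here, once, by the poll loop.
 */
int
poll_timeout_ms(time_t next, time_t now)
{
	time_t wait = MAX_TIMEOUT;

	if (next != -1) {
		if (next <= now)
			return 0;	/* already due, do not block */
		if (next - now < wait)
			wait = next - now;
	}
	return (int)(wait * 1000);
}
```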


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.
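
The distinction between element count and byte length can be sketched as
follows. set_bytes() is a hypothetical helper, not part of bgpd; the overflow
check is an extra precaution added for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Byte length of an array of nelem elements of elemsize bytes each,
 * or 0 if the product would overflow size_t. The length handed to
 * a send routine has to be this value, not nelem itself.
 */
size_t
set_bytes(size_t nelem, size_t elemsize)
{
	if (elemsize != 0 && nelem > SIZE_MAX / elemsize)
		return 0;
	return nelem * elemsize;
}
```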


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
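
A small sketch of a bsearch(3)-based membership test over a sorted array of
AS numbers. The function names as_set_prepare() and as_set_match() are
assumptions made for the example and do not come from the bgpd sources.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers for qsort/bsearch. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once after loading the config so lookups can use bsearch(3). */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

A lookup then costs O(log n) comparisons instead of a walk over the whole
list of AS numbers.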


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
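
The "pipe teardown" can be sketched with a socketpair and a child that reads
until EOF. This is an illustration under simplifying assumptions (one child,
a bare read loop standing in for the event loop); run_and_teardown() is a
hypothetical name.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start a child that reads from its end of a socketpair until EOF,
 * then stop it by closing the parent's end. Returns the child's exit
 * code, or -1 on error.
 */
int
run_and_teardown(void)
{
	int	sv[2], status;
	pid_t	pid;
	char	buf[64];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		/* event loop stand-in: read() returns 0 once the peer closes */
		while (read(sv[1], buf, sizeof(buf)) > 0)
			;
		_exit(0);
	}
	close(sv[1]);
	close(sv[0]);		/* no kill(2): the child sees EOF */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent never needs the child's PID to signal it, there is no
window in which a recycled PID could be hit by mistake.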


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
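
A simplified sketch of the pattern, using one descriptor instead of bgpd's
full pfd array. wait_readable() is a hypothetical helper. The array is
cleared before every call, and revents is only read when poll() reports
ready descriptors.

```c
#include <poll.h>
#include <string.h>

/*
 * Poll a single descriptor for input. Returns 1 if readable, 0 on
 * timeout, -1 on error. The array is zeroed first and revents is
 * only inspected when poll() reports at least one ready descriptor.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd	pfd[1];
	int		n;

	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n == -1)
		return -1;	/* e.g. EINTR: revents is not valid */
	if (n == 0)
		return 0;	/* timeout: nothing to look at */
	return (pfd[0].revents & POLLIN) ? 1 : 0;
}
```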


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
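
The ordering can be sketched like this. The names reconfig, reload() and
check_signals() are placeholders chosen for the example, not the identifiers
used in bgpd.

```c
#include <signal.h>

static volatile sig_atomic_t	reconfig;	/* set by the handler */
static int			reloads;	/* counts completed reloads */

/* Signal handler: only records that the signal arrived. */
void
sighdlr(int sig)
{
	if (sig == SIGHUP)
		reconfig = 1;
}

static void
reload(void)
{
	reloads++;	/* stand-in for the real work */
}

/* Called from the main loop after poll() returns. */
void
check_signals(void)
{
	if (reconfig) {
		reconfig = 0;	/* clear first: a signal during reload() is kept */
		reload();
	}
}
```

If the flag were cleared after reload(), a signal arriving while reload()
runs would set the flag and then have it wiped, losing the request.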


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
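
Descriptor passing over a UNIX socket is commonly done with an SCM_RIGHTS
control message. The following is a generic sketch of that mechanism using
sendmsg(2)/recvmsg(2) directly, not the imsg/msgbuf code from this commit;
send_fd() and recv_fd() are hypothetical names.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Send descriptor fd over the UNIX socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union {
		struct cmsghdr	hdr;	/* forces correct alignment */
		char		buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char		 c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;		/* one byte of ordinary payload */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == -1 ? -1 : 0;
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union {
		struct cmsghdr	hdr;
		char		buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char		 c;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the setup described above, the privileged side would call socket() and
bind() and then hand the result to the unprivileged side through something
like send_fd().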


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
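
The split between reading and parsing can be sketched as follows. This is
an illustration under assumptions, not the imsg code: struct msg_hdr,
struct rbuf, rbuf_read() and rbuf_get() are hypothetical names and the
wire header is simplified to a type and a total length.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RBUF_SIZE	4096

struct msg_hdr {		/* hypothetical wire header */
	uint16_t	type;
	uint16_t	len;	/* total length including this header */
};

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		wpos;	/* number of valid bytes in buf */
};

/* Read whatever is available into the buffer; returns read(2)'s result. */
ssize_t
rbuf_read(struct rbuf *rb, int fd)
{
	ssize_t	n;

	n = read(fd, rb->buf + rb->wpos, sizeof(rb->buf) - rb->wpos);
	if (n > 0)
		rb->wpos += (size_t)n;
	return n;	/* 0 means the peer closed the connection */
}

/*
 * Extract one message from the buffer without touching the fd.
 * Returns 1 if a message was copied out, 0 if more data is needed,
 * -1 if the header is bogus or the payload does not fit.
 */
int
rbuf_get(struct rbuf *rb, struct msg_hdr *hdr, void *payload, size_t max)
{
	size_t	plen;

	if (rb->wpos < sizeof(*hdr))
		return 0;
	memcpy(hdr, rb->buf, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > sizeof(rb->buf))
		return -1;
	if (rb->wpos < hdr->len)
		return 0;
	plen = hdr->len - sizeof(*hdr);
	if (plen > max)
		return -1;
	memcpy(payload, rb->buf + sizeof(*hdr), plen);
	/* shift any following messages to the front of the buffer */
	memmove(rb->buf, rb->buf + hdr->len, rb->wpos - hdr->len);
	rb->wpos -= hdr->len;
	return 1;
}
```

A caller would invoke rbuf_read() once per poll event and then call
rbuf_get() until it returns 0, so that several messages delivered by one
read are all processed right away.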


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.258 19-Apr-2023 claudio

Implement code to pass the flowspec config over to the RDE. The parent
process tracks which prefixes are added / removed and issues the
corresponding imsg calls.
Right now the RDE does nothing with the received information.
OK tb@


Revision tags: OPENBSD_7_3_BASE
# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries no longer
prevents bad routes from hitting the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes from being installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
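
A minimal sketch of the loop shape described above, under assumptions:
loop_once() and the reconfig_pending flag are hypothetical names, and the
pollfd handling itself is left to the caller. It shows only that the
pollfd results are ignored after EINTR while the signal flag check still
runs.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/* hypothetical flag, set from a SIGHUP handler */
volatile sig_atomic_t	reconfig_pending;

/*
 * One iteration of a main loop.  *ready is set to the number of
 * descriptors the caller may inspect (0 after EINTR).  Returns -1 on
 * a fatal poll() error, 1 if a reconfigure request was picked up,
 * 0 otherwise.
 */
int
loop_once(struct pollfd *pfd, nfds_t npfd, int timeout, int *ready)
{
	int	nfds;

	*ready = 0;
	nfds = poll(pfd, npfd, timeout);
	if (nfds == -1) {
		if (errno != EINTR)
			return -1;	/* fail hard on other errors */
		/* EINTR: revents are not valid, skip pollfd actions */
	} else
		*ready = nfds;

	/* signal delivery checks happen even if poll() was interrupted */
	if (reconfig_pending) {
		reconfig_pending = 0;
		return 1;
	}
	return 0;
}
```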


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
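
The problem can be sketched like this. example_addr2sa() is a hypothetical
stand-in written for this illustration, not bgpd's addr2sa(); it only
mimics the pattern of returning a sockaddr and storing its size through a
pointer. C does not specify the order in which function arguments are
evaluated, so the helper is called on its own line first.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill a sockaddr and store its size in *len. */
struct sockaddr *
example_addr2sa(const char *ip, uint16_t port, socklen_t *len)
{
	static struct sockaddr_in	sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return NULL;
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

/*
 * Risky form:  connect(fd, example_addr2sa(ip, port, &len), len);
 * len may be read before the helper has stored the size in it.
 */
int
example_connect(int fd, const char *ip, uint16_t port)
{
	struct sockaddr	*sa;
	socklen_t	 len = 0;

	/* call the helper first so len is set before it is used */
	if ((sa = example_addr2sa(ip, port, &len)) == NULL)
		return -1;
	return connect(fd, sa, len);
}
```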


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
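
A rough sketch of the idea, assuming a read size of 65535 bytes
(EXAMPLE_READ_SIZE is a made-up constant, and example_socketpair() is a
hypothetical name). It simply gives up if the kernel rejects the size; a
daemon may prefer to retry with smaller buffers instead.

```c
#include <sys/socket.h>
#include <unistd.h>

#define EXAMPLE_READ_SIZE	65535	/* assumed size of one read */

/* Create a socketpair whose buffers hold several reads worth of data. */
int
example_socketpair(int sv[2])
{
	int	bsize = 4 * EXAMPLE_READ_SIZE;
	int	i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```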


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
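
A minimal sketch of the case distinction, not the committed code;
fd_for_exec() is a hypothetical name. dup2() hands back a descriptor
without close-on-exec, but when both numbers are equal it does nothing,
so the flag has to be cleared explicitly.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make newd refer to the same file as oldd and make sure it stays
 * open across exec().  Returns newd, or -1 on error.
 */
int
fd_for_exec(int oldd, int newd)
{
	int	flags;

	if (oldd != newd)
		return dup2(oldd, newd) == -1 ? -1 : newd;

	/* same descriptor: dup2() would leave FD_CLOEXEC set */
	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return -1;
	if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
		return -1;
	return newd;
}
```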


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
look up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
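
The lookup idea can be sketched as a sorted array searched with
bsearch(3). This is an illustration only; as_cmp(), as_set_prepare() and
as_set_match() are hypothetical names and not the functions used in bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two 32-bit AS numbers without risking integer overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the table once after it has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons instead of a walk over a long list
of AS numbers.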


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
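
A small sketch of the teardown mechanism described above, with
hypothetical function names: the parent only closes its end of the IPC
socket, and the child notices because read(2) returns 0.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Child side: what the event loop sees when reading its IPC socket.
 * Returns 1 to keep running, 0 when the parent closed its end (EOF),
 * -1 on error.
 */
int
child_read_ipc(int fd)
{
	char	buf[512];
	ssize_t	n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	if (n == 0)
		return 0;	/* EOF: leave the event loop and exit */
	/* a real daemon would parse the received messages here */
	return 1;
}

/* Parent side: no kill(2), just close the socket to the child. */
void
parent_shutdown_child(int fd)
{
	close(fd);
}
```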


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
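
The pattern described above can be sketched as follows. This is a minimal
illustration, not the code from bgpd.c; the function name fd_readable() and
its return convention are assumptions made for the example. The array is
cleared before every poll(2) call and revents is only inspected when poll(2)
reported at least one ready descriptor.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll a single descriptor for readability.
 * Returns 1 if readable, 0 on timeout, -1 on error (including EINTR).
 */
static int
fd_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int n;

	/* Start from a clean array so no stale revents can survive. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n == -1)
		return (-1);	/* do not look at revents on error */
	if (n == 0)
		return (0);	/* timeout, nothing to inspect */
	return ((pfd[0].revents & POLLIN) ? 1 : 0);
}
```

A caller would retry on -1 with errno set to EINTR rather than treating the
descriptor as ready.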


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
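
A minimal sketch of the corrected ordering, assuming a SIGHUP-style reload
flag. The names reload_requested, on_sighup(), do_reload() and
handle_pending() are hypothetical and not taken from bgpd.c.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical names; not taken from bgpd.c. */
static volatile sig_atomic_t reload_requested;
static int reloads_done;

/* Signal handler: only records that the signal arrived. */
static void
on_sighup(int sig)
{
	(void)sig;
	reload_requested = 1;
}

/* Stand-in for the real work triggered by the signal. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work.  A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
static void
handle_pending(void)
{
	if (reload_requested) {
		reload_requested = 0;
		do_reload();
	}
}
```

With the old ordering, a signal delivered during the work would be lost when
the flag was reset afterwards.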


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
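
The check described above can be sketched like this. It is an illustration
only, not the bgpd.c code: child_is_gone() and spawn_and_reap() are
hypothetical helpers. A SIGCHLD is also delivered for stopped children, so
the status from waitpid(2) has to be examined before concluding that the
child terminated.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Decide whether a waitpid() status means the child is really gone.
 * Returns nonzero if it exited or was killed by a signal, 0 otherwise
 * (for example when it was merely stopped).
 */
static int
child_is_gone(int status)
{
	return (WIFEXITED(status) || WIFSIGNALED(status));
}

/*
 * Hypothetical helper for demonstration: fork a child that exits with
 * `code`, reap it, and return its exit code, or -1 on any failure.
 */
static int
spawn_and_reap(int code)
{
	pid_t pid, r;
	int status;

	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0)
		_exit(code);

	/* waitpid() returns a pid_t; retry if interrupted by a signal. */
	do {
		r = waitpid(pid, &status, 0);
	} while (r == -1 && errno == EINTR);

	if (r != pid || !child_is_gone(status) || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```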


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
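
The read-once, drain-in-a-loop pattern can be sketched with a simplified
framing. This is not the imsg API: struct rbuf, rbuf_fill(), rbuf_get() and
rbuf_drain() are hypothetical, and the 1-byte length prefix is an assumption
chosen to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	256
#define MSG_MAX		255	/* largest payload a 1-byte length can describe */

/* Hypothetical read buffer; the real imsg structures differ. */
struct rbuf {
	uint8_t	data[RBUF_SIZE];
	size_t	len;		/* bytes currently buffered */
};

/* Stand-in for the read step: append n bytes that arrived in one read. */
static int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > RBUF_SIZE - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return (0);
}

/*
 * Extract one message (1-byte payload length, then payload) from the
 * buffer without touching the descriptor.  `out` must hold MSG_MAX bytes.
 * Returns 1 if a message was extracted, 0 if none is complete yet.
 */
static int
rbuf_get(struct rbuf *rb, uint8_t *out, size_t *outlen)
{
	size_t plen;

	assert(out != NULL && outlen != NULL);
	if (rb->len < 1)
		return (0);
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return (0);	/* incomplete, wait for more data */
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return (1);
}

/* Drain every complete message after a single read; returns the count. */
static int
rbuf_drain(struct rbuf *rb)
{
	uint8_t msg[MSG_MAX];
	size_t len;
	int n = 0;

	while (rbuf_get(rb, msg, &len))
		n++;		/* a real caller would dispatch msg here */
	return (n);
}
```

Calling the get step only once per poll event would leave the second message
sitting in the buffer until more data arrived, which is the queue build-up
the commit describes.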


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.257 14-Feb-2023 claudio

No longer wait for the RTR process to finish the config reload before
sending the IMSG_RECONF_DONE message to the RDE. The RDE does not depend
on the RTR config reload (in contrast to the SE).
The ROA / ASPA reload is async from the RDE config reload.
OK tb@


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
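
The poll handling described in 1.240 and 1.241 can be sketched roughly as follows. This is a minimal illustration and not the bgpd code: example_reconfig, struct example_result and example_poll_once() are hypothetical names. It only shows the ordering: after a poll() failure the pollfd array is not inspected, but the signal flag is still checked.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical flag; a real SIGHUP handler would set it to 1. */
volatile sig_atomic_t example_reconfig;

struct example_result {
	int	ready_fds;	/* descriptors with pending input or hangup */
	int	reconfigured;	/* 1 if the signal flag was serviced */
	int	fatal;		/* 1 if poll() failed with an error other than EINTR */
};

struct example_result
example_poll_once(struct pollfd *pfd, nfds_t npfd, int timeout)
{
	struct example_result res = { 0, 0, 0 };
	nfds_t i;

	if (poll(pfd, npfd, timeout) == -1) {
		if (errno != EINTR)
			res.fatal = 1;
		/* revents cannot be trusted after an error: skip the fds */
	} else {
		for (i = 0; i < npfd; i++)
			if (pfd[i].revents & (POLLIN | POLLHUP))
				res.ready_fds++;
	}

	/* Signal flags are checked on every pass, including after EINTR. */
	if (example_reconfig) {
		example_reconfig = 0;
		res.reconfigured = 1;
	}
	return res;
}
```

A caller would loop on example_poll_once() and stop when fatal is set; whether to fail hard or restart on other errors is left to the caller in this sketch.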


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
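
The argument-ordering problem behind 1.235 can be illustrated with a small sketch. The helpers below are hypothetical stand-ins (example_addr2sa() is not the real addr2sa() signature, and example_consume() merely plays the role of connect() or bind()). C does not specify the order in which function arguments are evaluated, so the helper call is hoisted into its own statement before len is read.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for addr2sa(): fills ss and reports its length. */
struct sockaddr *
example_addr2sa(struct sockaddr_storage *ss, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(179);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

/* Stand-in for connect()/bind(): returns the length it was handed. */
socklen_t
example_consume(const struct sockaddr *sa, socklen_t len)
{
	(void)sa;
	return len;
}

socklen_t
example_call(void)
{
	struct sockaddr_storage ss;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Risky form: example_consume(example_addr2sa(&ss, &len), len);
	 * len may be read before or after the helper updates it.
	 */
	sa = example_addr2sa(&ss, &len);	/* len is set here ... */
	return example_consume(sa, len);	/* ... and read afterwards */
}
```

With the call hoisted, the length passed on is always the one the helper stored, independent of compiler and platform.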


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
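
A minimal sketch of the buffer sizing described in 1.220, assuming a read size of 65535 bytes (EXAMPLE_READ_SIZE is a made-up constant, and example_socketpair() is a hypothetical helper, not the function used in bgpd):

```c
#include <sys/socket.h>
#include <unistd.h>

#define EXAMPLE_READ_SIZE	65535	/* hypothetical read size */

/* Create a socketpair and request 4x the read size for both buffers. */
int
example_socketpair(int sv[2])
{
	int i, bsize = 4 * EXAMPLE_READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```

The kernel may clamp or reject the requested size depending on system limits; this sketch simply fails in the latter case.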


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
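
The idiom can be shown with a short sketch. struct example_peer and example_send() are hypothetical; example_send() only stands in for an imsg send call by copying the payload.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object passed between processes. */
struct example_peer {
	unsigned int	id;
	char		descr[32];
};

/* Stand-in for an imsg send: copies len bytes and returns the count. */
size_t
example_send(void *dst, const void *data, size_t len)
{
	memcpy(dst, data, len);
	return len;
}

size_t
example_send_peer(void *dst, const struct example_peer *p)
{
	/* sizeof(*p) keeps tracking the pointer's type if it ever changes */
	return example_send(dst, p, sizeof(*p));
}
```

With sizeof(struct example_peer) spelled out instead, a later change to the type of p would leave the size silently wrong.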


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
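
A minimal sketch of the case described in 1.214, under the assumption that the descriptor must survive exec() at a fixed number. example_move_fd() is a hypothetical helper. When both numbers are equal, dup2() leaves the close-on-exec flag untouched, so the flag is cleared explicitly instead.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make oldd available as newd without the close-on-exec flag. */
int
example_move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would be a no-op here and keep FD_CLOEXEC set */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return newd;
	}
	/* a descriptor created by dup2() never has FD_CLOEXEC set */
	return dup2(oldd, newd);
}
```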


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
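
The lookup idea can be sketched with the standard qsort() and bsearch() functions. The function names are hypothetical and the layout (a flat sorted array of 32-bit AS numbers) is an assumption for illustration, not a description of the actual as-set structure.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
example_as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* bsearch() needs sorted input, so sort once after loading the set. */
void
example_as_set_sort(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), example_as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
example_as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), example_as_cmp) != NULL;
}
```

Each lookup then costs O(log n) comparisons instead of a walk over a long list of AS numbers.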


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
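
The mechanism this relies on can be demonstrated in a single process, as a rough sketch: once one end of a socketpair is closed, a read on the other end returns 0 (EOF). example_pipe_teardown() is a hypothetical helper; in bgpd the two ends live in different processes.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the remaining end sees EOF, 0 if not, -1 on error. */
int
example_pipe_teardown(void)
{
	int sv[2];
	char buf[1];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	close(sv[0]);				/* "parent" drops its end */
	n = read(sv[1], buf, sizeof(buf));	/* "child" side reads */
	close(sv[1]);
	if (n == -1)
		return -1;
	return n == 0;
}
```

A child that treats a zero-length read as "peer gone" can leave its event loop without any signal being sent.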


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate a IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
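
A minimal sketch of the corrected ordering, with hypothetical names (`reconfig_pending`, `do_reload`, `check_reconfig`) that are not taken from bgpd.c. The flag is cleared before the work runs, so a signal that arrives while the work is in progress sets the flag again and is picked up on the next pass instead of being lost.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;	/* set by the handler */
static int reload_count;			/* stands in for real work */

/* Signal handler: only records that the signal arrived. */
void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reload_count++;
}

/* Called from the main loop. Returns 1 if a reload was performed. */
int
check_reconfig(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;	/* clear first ... */
	do_reload();		/* ... then work; a new signal re-sets the flag */
	return 1;
}
```

With the old order (work, then clear), a signal delivered between the two statements would be wiped out by the assignment and never acted upon.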


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
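
The waitpid()-and-check-status approach can be sketched like this. It is a minimal illustration with invented function names, not the handler from bgpd.c: a child counts as gone only if it exited or was terminated by a signal, and a SIGCHLD caused by a stopped child is ignored.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/* Returns 1 if the wait status means the child is really gone. */
int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Reap every child with a pending state change without blocking.
 * Returns how many children actually exited or were terminated.
 */
int
reap_children(void)
{
	pid_t	pid;
	int	status, gone = 0;

	for (;;) {
		pid = waitpid(-1, &status, WNOHANG | WUNTRACED);
		if (pid == -1) {
			if (errno == EINTR)
				continue;
			break;		/* ECHILD: no children at all */
		}
		if (pid == 0)
			break;		/* children exist, none changed state */
		if (child_is_gone(status))
			gone++;		/* stopped children are not counted */
	}
	return gone;
}
```

A caller would run `reap_children()` from the main loop after the SIGCHLD flag is seen, and only log and quit when the result is non-zero.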


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
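
The split between "read into the buffer" and "take one message out of the buffer" can be sketched with a toy framing. Everything below is hypothetical (the `rbuf_*` names, the 2-byte host-order length prefix); it is not the imsg wire format or API. It only shows why the consumer has to call the get step in a loop after each read.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rbuf {			/* hypothetical read buffer */
	uint8_t	data[4096];
	size_t	len;
};

/* Append bytes to the buffer; stands in for the read(2) step. */
int
rbuf_fill(struct rbuf *b, const void *p, size_t n)
{
	if (n > sizeof(b->data) - b->len)
		return -1;
	memcpy(b->data + b->len, p, n);
	b->len += n;
	return 0;
}

/* Extract one message: 1 = got one, 0 = need more data, -1 = too big. */
int
rbuf_get(struct rbuf *b, uint8_t *out, size_t outsz, size_t *outlen)
{
	uint16_t mlen;

	if (b->len < sizeof(mlen))
		return 0;
	memcpy(&mlen, b->data, sizeof(mlen));
	if (mlen > outsz)
		return -1;
	if (b->len < sizeof(mlen) + mlen)
		return 0;
	memcpy(out, b->data + sizeof(mlen), mlen);
	*outlen = mlen;
	b->len -= sizeof(mlen) + mlen;
	memmove(b->data, b->data + sizeof(mlen) + mlen, b->len);
	return 1;
}

/* Process every complete message currently buffered; returns the count. */
int
rbuf_drain(struct rbuf *b)
{
	uint8_t	msg[256];
	size_t	len;
	int	n = 0;

	while (rbuf_get(b, msg, sizeof(msg), &len) == 1)
		n++;		/* a real caller would dispatch msg here */
	return n;
}
```

If the drain loop were replaced by a single get call, the second message of a batch would sit in the buffer until the next poll event, which is the queue build-up the commit message describes.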


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.256 20-Jan-2023 claudio

comma space not space comma


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In anycase do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@
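
A non-blocking connect generally follows the three steps sketched below. The function names are made up for illustration and this is not the RTR code from bgpd.c: the socket is switched to non-blocking mode, connect(2) is started, and the outcome is fetched with SO_ERROR once poll(2) reports the socket as writable.

```c
#include <sys/socket.h>
#include <errno.h>
#include <fcntl.h>

/* Put the descriptor into non-blocking mode. */
int
set_nonblock(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Returns 0 if connected at once, 1 if in progress, -1 on failure. */
int
start_connect(int fd, const struct sockaddr *sa, socklen_t len)
{
	if (set_nonblock(fd) == -1)
		return -1;
	if (connect(fd, sa, len) == 0)
		return 0;
	return errno == EINPROGRESS ? 1 : -1;
}

/*
 * Call after poll() reports POLLOUT on the socket.
 * Returns 0 on success, the pending errno value on a failed connect,
 * or -1 if getsockopt() itself fails.
 */
int
finish_connect(int fd)
{
	int		err = 0;
	socklen_t	len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return -1;
	return err;
}
```

While the connect is pending, the descriptor is simply added to the poll set with POLLOUT, so an unreachable address no longer stalls the main loop.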


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
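
The pitfall is that C does not fix the order in which function arguments are evaluated, so in `bind(fd, f(&len), len)` the value of `len` may be read before `f()` has written it. A minimal sketch of the safe pattern follows, using a hypothetical helper `make_sa()` as a stand-in for addr2sa() (its signature is an assumption and differs from the real one).

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Hypothetical stand-in for addr2sa(): fills *ss and stores the size in *len. */
struct sockaddr *
make_sa(struct sockaddr_storage *ss, const char *ip, unsigned short port,
    socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin->sin_addr) != 1)
		return NULL;
	*len = sizeof(*sin);
	return (struct sockaddr *)sin;
}

int
bind_to(int fd, const char *ip, unsigned short port)
{
	struct sockaddr_storage	 ss;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/* NOT: bind(fd, make_sa(&ss, ip, port, &len), len); */
	if ((sa = make_sa(&ss, ip, port, &len)) == NULL)
		return -1;
	return bind(fd, sa, len);	/* len is now guaranteed to be set */
}
```

Splitting the call into two statements puts a full sequence point between writing and reading `len`, which removes the dependence on the compiler's argument evaluation order.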


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
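
Enlarging the buffers of a socketpair can be sketched as below. The constant `READ_SIZE` and both function names are assumptions for illustration; the actual read size and any fallback logic in bgpd.c are not shown in this log.

```c
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535			/* hypothetical read size */
#define SOCK_BUF	(4 * READ_SIZE)		/* 4 times the read size */

/* Request the same send and receive buffer size on one socket. */
int
grow_bufs(int fd, int size)
{
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) == -1)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) == -1)
		return -1;
	return 0;
}

/* Create a socketpair with enlarged buffers on both ends. */
int
make_pipe(int sv[2])
{
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if (grow_bufs(sv[0], SOCK_BUF) == -1 ||
	    grow_bufs(sv[1], SOCK_BUF) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	return 0;
}
```

Note that the kernel may round or cap the requested size, so the value reported by getsockopt() afterwards can differ from `SOCK_BUF`.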


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
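
The fix can be sketched as follows, with a hypothetical helper name `move_fd`. When the descriptor already has the wanted number, dup2() does nothing and the close-on-exec flag stays set, so the flag has to be cleared by hand; otherwise dup2() creates the new descriptor without FD_CLOEXEC.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd across exec().
 * Returns 0 on success, -1 on error.
 */
int
move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would be a no-op and keep FD_CLOEXEC set */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* the duplicate made by dup2() never has FD_CLOEXEC set */
	if (dup2(oldd, newd) == -1)
		return -1;
	return 0;
}
```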


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain), the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes it all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
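
A minimal sketch of a bsearch(3) membership test on a sorted array of AS numbers. The names as_cmp() and as_set_contains() are illustrative and are not taken from bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for bsearch(3): orders two 32-bit AS numbers. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Return 1 if as is in the sorted array set[0..n-1], 0 otherwise. */
static int
as_set_contains(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

The array has to be kept sorted with the same comparison function. Each lookup then costs O(log n) comparisons instead of a linear walk over a list.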


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
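
The "pipe teardown" idea can be sketched with a socketpair and a single child. This is a simplified illustration, not bgpd's shutdown code, and pipe_teardown_demo() is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that blocks reading its end of a socketpair, then close
 * the parent's end. The child sees EOF (read() returns 0) and exits
 * with status 0. Returns the child's exit status, or -1 on error.
 */
static int
pipe_teardown_demo(void)
{
	int	 sv[2], status;
	pid_t	 pid;
	char	 c;
	ssize_t	 n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		close(sv[0]);		/* child keeps only sv[1] */
		while ((n = read(sv[1], &c, 1)) > 0)
			;		/* stand-in event loop: discard data */
		_exit(n == 0 ? 0 : 1);	/* n == 0 means EOF */
	}
	close(sv[1]);			/* parent keeps only sv[0] */
	close(sv[0]);			/* teardown: no kill(2) needed */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```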


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
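
A small sketch of setting both flags at socket creation time, assuming a platform that supports SOCK_NONBLOCK and SOCK_CLOEXEC as socket type flags (OpenBSD and Linux do). It uses socketpair() for brevity, and nonblock_cloexec_pair() is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a connected AF_UNIX socket pair that is non-blocking and
 * close-on-exec from the start, with no follow-up fcntl() calls.
 * Returns 0 on success, -1 on error.
 */
static int
nonblock_cloexec_pair(int sv[2])
{
	return socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, sv);
}
```

accept4() takes the same two flags for accepted connections, so no descriptor exists even briefly without them.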


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
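
A minimal sketch of detecting a broken pipe without the signal: with SIGPIPE ignored, write(2) fails with EPIPE instead of terminating the process. The function name write_to_broken_pipe() is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE, then write to a pipe whose read end is closed.
 * Returns the errno of the failed write (expected: EPIPE), 0 if the
 * write unexpectedly succeeded, or -1 on setup failure.
 */
static int
write_to_broken_pipe(void)
{
	int	fds[2], saved;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);			/* no reader left */
	if (write(fds[1], "x", 1) != -1) {
		close(fds[1]);
		return 0;
	}
	saved = errno;			/* close() may clobber errno */
	close(fds[1]);
	return saved;
}
```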


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) from being carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
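
The idea of clearing the array on every pass can be sketched for a single descriptor as follows. poll_once() is a hypothetical helper, not the parent's actual poll loop.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Zero the pollfd entry on every call so that stale revents from a
 * previous round cannot be mistaken for fresh events when poll() fails.
 * Returns poll()'s result; revents is only meaningful if it is > 0.
 */
static int
poll_once(struct pollfd *pfd, int fd, int timeout_ms)
{
	memset(pfd, 0, sizeof(*pfd));
	pfd->fd = fd;
	pfd->events = POLLIN;
	return poll(pfd, 1, timeout_ms);
}
```

A caller should still check the return value first and skip the revents handling on -1 or 0.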


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense this way.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters,
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of allocating them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
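
The corrected ordering, sketched with hypothetical names (reload_flag, do_reload(), check_reload()); this illustrates the pattern and is not bgpd's actual handler code.

```c
#include <signal.h>

static volatile sig_atomic_t reload_flag;	/* set by the handler */
static int reloads_done;			/* hypothetical work counter */

static void
sighup_handler(int sig)
{
	(void)sig;
	reload_flag = 1;
}

/* Hypothetical work function. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work. A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
static void
check_reload(void)
{
	if (reload_flag) {
		reload_flag = 0;
		do_reload();
	}
}
```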


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups). OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
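
A minimal sketch of the status check: SIGCHLD can also be raised for a stopped or continued child, so only WIFEXITED and WIFSIGNALED mean the child is really gone. child_is_gone() and reap_demo() are hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Classify a wait status: 1 if the child exited or was killed by a
 * signal, 0 if it merely stopped or continued.
 */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/* Fork a child that exits with code, reap it, report whether it is gone. */
static int
reap_demo(int code)
{
	int	status;
	pid_t	pid;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0)
		_exit(code);
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return child_is_gone(status);
}
```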


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
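
The split between "read into the buffer" and "extract one message" can be sketched like this. It is a simplified model under stated assumptions, not the imsg code itself: struct rbuf, struct msg_hdr, rbuf_fill() and rbuf_get() are hypothetical names, and the framing is a toy 16-bit length header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define RBUF_SIZE	4096

/* Hypothetical read buffer; the real imsg structures differ. */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/* Toy framing: a 16-bit total length followed by the payload. */
struct msg_hdr {
	uint16_t	len;		/* header + payload */
};

/*
 * Append bytes that one read(2) call delivered. Returns 0 on
 * success, -1 if the buffer would overflow.
 */
int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	assert(rb != NULL && src != NULL);
	if (n > sizeof(rb->data) - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return (0);
}

/*
 * Extract one complete message without touching the socket.
 * Returns the total message length (> 0), 0 if no complete message
 * is buffered, or -1 on a bad header or too-small output buffer.
 */
ssize_t
rbuf_get(struct rbuf *rb, void *payload, size_t max)
{
	struct msg_hdr	hdr;
	size_t		plen;

	if (rb->len < sizeof(hdr))
		return (0);
	memcpy(&hdr, rb->data, sizeof(hdr));
	if (hdr.len < sizeof(hdr) || hdr.len > sizeof(rb->data))
		return (-1);
	if (rb->len < hdr.len)
		return (0);		/* message still incomplete */
	plen = hdr.len - sizeof(hdr);
	if (plen > max)
		return (-1);
	memcpy(payload, rb->data + sizeof(hdr), plen);
	/* shift any following message(s) to the front */
	memmove(rb->data, rb->data + hdr.len, rb->len - hdr.len);
	rb->len -= hdr.len;
	return ((ssize_t)hdr.len);
}
```

On a poll event the caller would fill the buffer once and then call rbuf_get() in a loop until it returns 0, so a second message that arrived in the same read is not left waiting for the next poll event.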


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.255 18-Nov-2022 claudio

Add plumbing for ASPA support. This implements the parser and part of the
logic in the rtr process. It does not implement the new RTR messages yet
but it is possible to specify an aspa-set in the config. Also the validation
code in the RDE is missing so this does not do anything.
With this in it will be possible to extend rpki-client to publish an
aspa-set as part of the openbgpd config file.
OK tb@


Revision tags: OPENBSD_7_2_BASE
# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interfaces rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
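
The evaluation-order problem can be illustrated with a stand-in helper. This is a sketch under assumptions: v4_to_sa() and connect_v4() are hypothetical and only mimic the shape of a helper that returns a sockaddr and stores its size through a pointer.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for a helper like addr2sa(): fills a static
 * sockaddr for an IPv4 address and reports its size through *len.
 */
struct sockaddr *
v4_to_sa(in_addr_t addr, in_port_t port, socklen_t *len)
{
	static struct sockaddr_in	sin;

	assert(len != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = addr;	/* already in network order */
	sin.sin_port = htons(port);
	*len = sizeof(sin);
	return ((struct sockaddr *)&sin);
}

/*
 * Risky:  connect(fd, v4_to_sa(addr, port, &len), len);
 * C does not specify the order in which function arguments are
 * evaluated, so "len" may be read before v4_to_sa() has stored
 * the size into it.
 *
 * Safe: call the helper first, then pass the now-initialised len.
 */
int
connect_v4(int fd, in_addr_t addr, in_port_t port)
{
	struct sockaddr	*sa;
	socklen_t	 len = 0;

	sa = v4_to_sa(addr, port, &len);
	return (connect(fd, sa, len));
}
```

Whether the inlined form works depends on the compiler and ABI, which would explain why the problem showed up on one platform and not another.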


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this rework. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
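
A minimal sketch of the case distinction, assuming the descriptor was opened with the close-on-exec flag set. The helper name fd_for_exec() is hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make "fd" available as descriptor "target"
 * in a program started by a later exec().
 * dup2() does not copy FD_CLOEXEC to the new descriptor, but when
 * both numbers are equal it does nothing, so the flag has to be
 * cleared by hand. Returns 0 on success, -1 on error.
 */
int
fd_for_exec(int fd, int target)
{
	int	flags;

	assert(fd >= 0 && target >= 0);
	if (fd == target) {
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	if (dup2(fd, target) == -1)
		return (-1);
	return (0);
}
```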


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
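
The bsearch-based lookup can be sketched as follows. This is a minimal illustration assuming a flat array of 32-bit AS numbers; as_cmp(), as_set_prepare() and as_set_match() are hypothetical names, not the functions added by this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort()/bsearch() comparison for 32-bit AS numbers. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));	/* avoids overflow of x - y */
}

/* Sort the table once, after the set has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL);
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if "as" is in the sorted table, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	assert(set != NULL);
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup costs O(log n) comparisons, compared with walking a long list of AS numbers in a filter rule one by one.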


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
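
The child side of such a teardown can be sketched like this. It is a simplified, blocking illustration, not the bgpd event loop; drain_until_eof() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical child-side loop: consume data from the IPC
 * descriptor until the peer closes it. read(2) returning 0 means
 * EOF, which is the child's cue to leave its loop and exit.
 * Returns the number of bytes seen before EOF, or -1 on error.
 */
ssize_t
drain_until_eof(int fd)
{
	char	buf[512];
	ssize_t	n, total = 0;

	assert(fd >= 0);
	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			return (total);	/* parent closed its end */
		total += n;
	}
}
```

The parent only has to close its end of each IPC channel and then waitpid() for the children; no kill(2) and therefore no PID to get wrong.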


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
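
A small sketch of the defensive pattern described here, under the assumption of a single descriptor. poll_read() is a hypothetical helper, not a bgpd function: the array is cleared before every poll(2) call, and revents is only inspected when poll(2) reports at least one ready descriptor.

```c
#include <sys/types.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for input on fd, then read it.
 * Returns the read(2) result, or -1 with errno set
 * (EAGAIN on timeout, poll's own errno such as EINTR on failure).
 */
ssize_t
poll_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd[1];
	int nfds;

	/* Start clean so no stale revents can survive a failed poll(). */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return (-1);
	if (nfds == 0) {
		errno = EAGAIN;
		return (-1);
	}

	/* Only now is revents meaningful. */
	if (pfd[0].revents & (POLLIN | POLLHUP))
		return (read(fd, buf, len));
	errno = EIO;
	return (-1);
}
```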


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
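
The ordering can be sketched as follows. All names here (reconfig_pending, do_reload(), check_signals()) are hypothetical and only illustrate the pattern. If the flag were cleared after the work, a signal arriving while the work is running would be wiped out and lost; clearing first means such a signal is picked up on the next pass.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set by the handler only */
int reloads;				/* completed reloads, for illustration */

void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
install_sighup(void)
{
	signal(SIGHUP, on_sighup);
}

void
do_reload(void)
{
	reloads++;	/* stand-in for the real work */
}

/* Called from the main loop after poll() returns. */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;	/* clear first ... */
		do_reload();		/* ... then do the work */
	}
}
```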


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
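
For readers unfamiliar with descriptor passing, here is a sketch of the underlying mechanism using SCM_RIGHTS over a Unix-domain socket. It is not the imsg/msgbuf implementation; send_fd() and recv_fd() are hypothetical helpers, and one dummy data byte is sent along with the control message.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Pass fd over the Unix-domain socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return (fd);
}
```

In the scenario from the commit text, the privileged process would call socket() and bind() and then hand the result to the unprivileged one with something like send_fd().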


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
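
A sketch of the check described above, assuming a single known child pid. child_is_gone() is a hypothetical helper: SIGCHLD is also delivered when a child merely stops or continues, so the status from waitpid(2) decides whether the child is really gone.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>	/* fork() in the caller */

/*
 * Reap one child without blocking.
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if there is no termination to report (child still running),
 * -1 on waitpid() error.
 */
int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;

	do {
		r = waitpid(pid, &status, WNOHANG);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);
	return (WIFEXITED(status) || WIFSIGNALED(status));
}
```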


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
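
The split between "read into the buffer" and "pop one message from the buffer" can be sketched like this. The types and functions below (struct rbuf, struct msg_hdr, rbuf_fill(), rbuf_get()) are hypothetical stand-ins with a made-up header layout, not the real imsg API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		wpos;		/* bytes currently buffered */
};

struct msg_hdr {
	uint16_t	type;
	uint16_t	len;		/* total length, header included */
};

/* The "imsg_read" role: append bytes that came from one read(2). */
int
rbuf_fill(struct rbuf *rb, const void *data, size_t n)
{
	if (n > sizeof(rb->buf) - rb->wpos)
		return (-1);
	memcpy(rb->buf + rb->wpos, data, n);
	rb->wpos += n;
	return (0);
}

/*
 * The "imsg_get" role: pop one complete message without touching the fd.
 * Returns 1 if a message was copied out, 0 if more data is needed,
 * -1 on a bad header or a payload buffer that is too small.
 */
int
rbuf_get(struct rbuf *rb, struct msg_hdr *hdr, void *payload, size_t paylen)
{
	size_t plen;

	if (rb->wpos < sizeof(*hdr))
		return (0);
	memcpy(hdr, rb->buf, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > sizeof(rb->buf))
		return (-1);
	plen = hdr->len - sizeof(*hdr);
	if (plen > paylen)
		return (-1);
	if (rb->wpos < hdr->len)
		return (0);
	memcpy(payload, rb->buf + sizeof(*hdr), plen);
	/* Shift what is left to the front; the regions may overlap. */
	memmove(rb->buf, rb->buf + hdr->len, rb->wpos - hdr->len);
	rb->wpos -= hdr->len;
	return (1);
}
```

A caller would fill the buffer once per poll event and then call rbuf_get() in a loop until it returns 0, so that several messages delivered by one read are all processed.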


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.254 17-Aug-2022 claudio

Convert bzero() to memset(), bcmp() to memcmp() and bcopy() to memcpy().

The memory regions passed to memcpy() can not overlap so no need for memmove().
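
As a reminder of the distinction this note relies on, here is a small sketch with hypothetical helpers (copy_addr() and drop_prefix() are not bgpd functions). memcpy() is fine between two distinct objects; memmove() is needed when source and destination lie in the same buffer and may overlap.

```c
#include <stddef.h>
#include <string.h>

/* Distinct source and destination objects: memcpy() is sufficient. */
void
copy_addr(unsigned char dst[16], const unsigned char src[16])
{
	memcpy(dst, src, 16);
}

/*
 * Drop the first n bytes of buf by shifting the rest to the front.
 * Source and destination overlap, so memmove() is required.
 * The caller must ensure n <= len.
 */
void
drop_prefix(char *buf, size_t len, size_t n)
{
	memmove(buf, buf + n, len - n);
}
```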
OK tb@ deraadt@


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries no longer
prevents bad routes from hitting the FIB. Also loopback IPs are checked in a
few other places to prevent bad routes from being installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen, so that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
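
A minimal sketch of the pattern described above, not the actual bgpd
code: poll() is wrapped so that EINTR is reported as "nothing ready",
and the signal flag is checked separately so it is handled even when
poll() was interrupted. The names poll_ready(), take_reconfig() and
reconfig_pending are hypothetical.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/* Hypothetical flag; a real daemon would set it from a signal handler. */
static volatile sig_atomic_t reconfig_pending;

/*
 * Wrap poll(): return the number of ready descriptors, 0 if the call was
 * interrupted (revents must not be inspected then), or -1 on a fatal error.
 */
int
poll_ready(struct pollfd *pfd, nfds_t npfd, int timeout_ms)
{
	int nready;

	nready = poll(pfd, npfd, timeout_ms);
	if (nready == -1)
		return (errno == EINTR) ? 0 : -1;
	return nready;
}

/* Report and clear a pending reconfigure request. */
int
take_reconfig(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;
	return 1;
}
```

In a main loop the caller would look at the pollfds only when
poll_ready() returns a positive count, and would call take_reconfig()
on every iteration regardless of that result.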


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@
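
As an illustration of the technique (a sketch under assumptions, not
the code from this commit): a non-blocking connect() is started, the
caller returns to its poll loop, and the outcome is collected with
SO_ERROR once the socket becomes writable. The helper names
connect_start() and connect_result() are hypothetical, and the socket
is assumed to have been created non-blocking already.

```c
#include <sys/socket.h>
#include <errno.h>

/*
 * Start a connect on a non-blocking socket.
 * Returns 1 if already connected, 0 if in progress, -1 on error.
 */
int
connect_start(int fd, const struct sockaddr *sa, socklen_t len)
{
	if (connect(fd, sa, len) == 0)
		return 1;
	if (errno == EINPROGRESS || errno == EINTR)
		return 0;	/* completes asynchronously */
	return -1;
}

/*
 * Call once poll() reports the socket writable. Returns 0 on success,
 * otherwise the errno value describing why the connect failed.
 */
int
connect_result(int fd)
{
	int error = 0;
	socklen_t len = sizeof(error);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &error, &len) == -1)
		return errno;
	return error;
}
```

The main process never waits inside connect() this way; an unreachable
peer only shows up later as a non-zero connect_result().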


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
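
A small sketch of the problem and the fix. loopback_sa() is a
hypothetical stand-in for a helper in the style of addr2sa() that
returns the sockaddr and stores its size through a pointer; it is not
the bgpd function. C does not guarantee the order in which function
arguments are evaluated, so the length has to be stored before the
call that reads it.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill a sockaddr for the IPv4 loopback address
 * and report its size through *len. (BSD systems also have sin_len,
 * left out here for portability.)
 */
struct sockaddr *
loopback_sa(struct sockaddr_in *sin, uint16_t port, socklen_t *len)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)sin;
}

int
connect_loopback(int fd, uint16_t port)
{
	struct sockaddr_in sin;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Risky: connect(fd, loopback_sa(&sin, port, &len), len);
	 * len may be read before the helper has stored into it.
	 */
	sa = loopback_sa(&sin, port, &len);	/* len is set here... */
	return connect(fd, sa, len);		/* ...and read afterwards */
}
```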


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
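
A rough sketch of the idea, assuming a made-up READ_SIZE; the real
read size and the error handling in bgpd are not shown in this log.
Both ends of the socketpair get send and receive buffers of four
times the read size. The kernel may clamp or reject the requested
size, which a real implementation would have to handle.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical read size, chosen only for this example. */
#define READ_SIZE	16384

/*
 * Create a socketpair and ask for send and receive buffers of four
 * times the read size on both ends. Returns 0 on success, -1 on error.
 */
int
ipc_socketpair(int sv[2])
{
	int bsize = 4 * READ_SIZE;
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF, &bsize,
		    sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF, &bsize,
		    sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```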


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
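
A minimal sketch of a bsearch-based membership test, assuming the set
is a plain array of 32-bit AS numbers. The function names are
hypothetical and the data layout in bgpd may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once after it has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Sorting once and searching in O(log n) per lookup is what makes such
a table cheaper than walking a long list of AS numbers for every
filter evaluation.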


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
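
The mechanism can be sketched like this, with hypothetical helper
names and without the imsg layer bgpd actually uses: the parent only
closes its end of the IPC socket, and the child notices because
read() returns 0.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * Child side: wait until the parent sends data or goes away.
 * Returns 1 on EOF (peer closed its end), 0 if data arrived, -1 on error.
 */
int
peer_closed(int fd)
{
	char buf[64];
	ssize_t n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}

/* Parent side: closing the IPC socket is the whole shutdown request. */
int
request_shutdown(int ipc_fd)
{
	return close(ipc_fd);
}
```

Since no kill(2) is involved, the parent does not need to track child
PIDs for shutdown, which is where the PID reuse race went away.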


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
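
A short sketch of why the signal is not needed, using hypothetical
helper names: with SIGPIPE ignored, a write to a pipe whose reader is
gone fails with EPIPE, which the caller can treat as "peer closed".

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE so a write to a closed pipe fails instead of killing us. */
int
ignore_sigpipe(void)
{
	return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/*
 * Write len bytes; returns 1 if the peer is gone (EPIPE), 0 if the
 * write was accepted, -1 on any other error.
 */
int
write_or_detect_close(int fd, const void *buf, size_t len)
{
	if (write(fd, buf, len) != -1)
		return 0;
	return errno == EPIPE ? 1 : -1;
}
```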


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is
automatically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD, reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
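
A minimal sketch of the corrected pattern, assuming hypothetical names (reconfig_pending, do_reload(), check_signals()) that do not come from bgpd itself:

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* hypothetical flag name */
int reloads_done;			/* counts calls, for illustration */

/* Signal handler: only records that work is pending. */
void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work. A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next
 * pass, instead of being wiped out by a late "flag = 0".
 */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```

With the old order, a SIGHUP delivered during do_reload() would be lost because the flag was reset after the work had finished.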


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
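
The check described above might look like the following sketch. child_gone() is a hypothetical helper written for illustration; the real bgpd code logs and quits at this point instead of returning a value.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap one child without blocking. Returns 1 if a child has really
 * terminated (exited or was killed by a signal), 0 if no child has
 * terminated (SIGCHLD can also be raised when a child merely stops),
 * and -1 on waitpid() error.
 * On a return of 1, *gone holds the pid of the dead child.
 */
int
child_gone(pid_t *gone)
{
	int status;
	pid_t pid;

	pid = waitpid(-1, &status, WNOHANG);
	if (pid == -1)
		return -1;
	if (pid == 0)
		return 0;	/* children exist, none has terminated */
	if (WIFEXITED(status) || WIFSIGNALED(status)) {
		*gone = pid;
		return 1;
	}
	return 0;		/* defensive: any other state change */
}
```

The parent can compare the returned pid against the pids it recorded at fork time to tell which child went away.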


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.253 28-Jul-2022 deraadt

whitespace found during a read-thru; ok claudio


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interfaces rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGPD_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
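
The pitfall can be illustrated with a small sketch. loopback_sa() is a hypothetical stand-in for addr2sa() (whose real signature is not shown here), and bind_loopback() is likewise invented for the example.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills `ss` with an IPv4
 * loopback address and stores the matching length in *len.
 */
struct sockaddr *
loopback_sa(struct sockaddr_storage *ss, in_port_t port, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

int
bind_loopback(int fd, in_port_t port)
{
	struct sockaddr_storage ss;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Not: bind(fd, loopback_sa(&ss, port, &len), len);
	 * The order in which function arguments are evaluated is
	 * unspecified in C, so `len` could be read before
	 * loopback_sa() has set it.
	 */
	sa = loopback_sa(&ss, port, &len);
	return bind(fd, sa, len);
}
```

Splitting the call into two statements forces the length to be written before it is read, whichever order the compiler picks for argument evaluation.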


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
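
As a rough sketch of the idea, assuming a hypothetical READ_SIZE constant and helper name open_channel() (neither is taken from bgpd):

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* hypothetical read buffer size */

/*
 * Create a socketpair and ask for send/receive buffers of four times
 * the read size on both ends. Returns 0 on success, -1 on error.
 * The kernel may clamp or round the requested sizes.
 */
int
open_channel(int fds[2])
{
	int i, size = 4 * READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(fds[i], SOL_SOCKET, SO_RCVBUF,
		    &size, sizeof(size)) == -1 ||
		    setsockopt(fds[i], SOL_SOCKET, SO_SNDBUF,
		    &size, sizeof(size)) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}
```

Whether a failing setsockopt() should be fatal is a design choice; this sketch treats it as an error, which may be stricter than what the daemon does.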


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
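
One way to handle both cases is sketched below. fd_pass_to_exec() is a hypothetical helper; clearing the flag with fcntl() in the equal-descriptor case is an assumption about a reasonable fix, not a quote of the committed code.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor `oldd` available as `newd` across exec().
 * dup2() clears close-on-exec on the copy, but when oldd == newd it
 * does nothing at all, so the flag has to be cleared by hand.
 * Returns 0 on success, -1 on error.
 */
int
fd_pass_to_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		flags = fcntl(oldd, F_GETFD);
		if (flags == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	return dup2(oldd, newd) == -1 ? -1 : 0;
}
```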


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
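
The bsearch-based membership test described in this commit can be sketched
as follows. This is a minimal illustration and not bgpd's actual as-set code:
the names as_cmp and as_set_match are hypothetical, and the array is assumed
to be sorted in ascending order before any lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for bsearch(3): orders 32-bit AS numbers. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper: return 1 if asnum is in the sorted array
 * set[0..n-1], else 0. The caller must keep the array sorted ascending.
 */
int
as_set_match(const uint32_t *set, size_t n, uint32_t asnum)
{
	assert(set != NULL);
	return bsearch(&asnum, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, whereas walking a long list of AS
numbers in a filter statement costs O(n).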


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
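
The EOF behaviour this "pipe teardown" relies on can be sketched as below.
This is a minimal illustration under stated assumptions, not the bgpd event
loop: peer_closed is a hypothetical helper that does a blocking read(2) on
one end of a socket pair and reports whether the other end has been closed.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if the peer closed its end of fd
 * (read(2) returns 0, i.e. EOF), 0 if data was read, -1 on error.
 */
int
peer_closed(int fd)
{
	char buf[64];
	ssize_t n;

	assert(fd >= 0);
	do {
		n = read(fd, buf, sizeof(buf));
	} while (n == -1 && errno == EINTR);

	if (n == -1)
		return -1;
	return n == 0;
}
```

A child that sees EOF on its IPC socket can leave its event loop and exit on
its own, so the parent never needs to call kill(2).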


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
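
A minimal sketch of the flag-based approach, assuming a platform that
supports SOCK_NONBLOCK and SOCK_CLOEXEC in the socket(2) type argument
(OpenBSD and Linux do). The helper name nonblocking_socket is hypothetical;
accept4(2) takes the same two flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a stream socket that is non-blocking and
 * close-on-exec from the start, so no follow-up fcntl(2) calls are needed.
 * Returns the descriptor, or -1 on error.
 */
int
nonblocking_socket(int domain)
{
	return socket(domain, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}
```

Setting both flags atomically at creation time also closes the window in
which a fork and exec could inherit the descriptor before FD_CLOEXEC is set.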


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and re-add the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
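
The fix described here can be sketched as below. This is a minimal
illustration, not the code from bgpd.c: PFD_COUNT and pfd_setup are
hypothetical names, and the two descriptors stand in for the pipes to the
child processes.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical: one slot per child pipe */

/*
 * Prepare the pollfd array for one poll(2) call. Zeroing it first
 * guarantees that revents left over from a previous iteration cannot be
 * acted on if poll() fails (e.g. with EINTR) and never writes the array.
 */
void
pfd_setup(struct pollfd *pfd, int fd_a, int fd_b)
{
	assert(pfd != NULL);
	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;
}
```

Calling such a setup routine at the top of every loop iteration keeps stale
POLLIN bits from triggering a blocking read on a descriptor with no data.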


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
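
The ordering this commit describes can be sketched as follows. The names
reconfig_pending, sighdlr, dowork and handle_pending are hypothetical and
only illustrate the pattern; they are not taken from bgpd.c.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;	/* set by the handler */
static int work_done;				/* counts dowork() calls */

/* Signal handler: only records that work is pending. */
void
sighdlr(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
dowork(void)
{
	work_done++;
}

/*
 * Clear the flag before doing the work. A signal that arrives while
 * dowork() runs sets the flag again and is handled on the next pass,
 * instead of being wiped out by a late "flag = 0".
 */
void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		dowork();
	}
}
```

With the opposite order, a signal delivered between the end of the work and
the assignment would be lost without any trace.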


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
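
The status check described in this commit can be sketched as below. This is
a minimal illustration, not the bgpd signal handling code: child_is_gone is
a hypothetical helper that interprets a status value filled in by
waitpid(2).

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: interpret a status value from waitpid(2).
 * Return 1 if the child is really gone (it exited or was killed by a
 * signal), 0 otherwise, e.g. when it was only stopped or continued.
 */
int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}
```

A parent would call waitpid(2) after SIGCHLD, pass the resulting status to a
check like this, and only log and quit when it returns 1.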


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
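
The split between "read into the buffer" and "get one message from the
buffer" can be sketched as below. This is not the imsg API: struct rbuf,
msg_get and msg_drain are hypothetical, and the framing (a 2-byte length
header in host byte order that counts the whole record) is an assumption
made only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive buffer, filled elsewhere by a read(2) wrapper. */
struct rbuf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer into out. Returns the
 * payload length (possibly 0), or -1 if no complete, well-formed message
 * is buffered. Never touches a file descriptor.
 */
int
msg_get(struct rbuf *rb, unsigned char *out, size_t outlen)
{
	uint16_t total;

	assert(rb != NULL && out != NULL);
	if (rb->len < sizeof(total))
		return -1;
	memcpy(&total, rb->data, sizeof(total));
	if (total < sizeof(total) || total > rb->len)
		return -1;
	if (total - sizeof(total) > outlen)
		return -1;
	memcpy(out, rb->data + sizeof(total), total - sizeof(total));
	/* Shift any following messages to the front of the buffer. */
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (int)(total - sizeof(total));
}

/* Drain every complete message currently buffered; returns the count. */
int
msg_drain(struct rbuf *rb)
{
	unsigned char payload[256];
	int n = 0;

	while (msg_get(rb, payload, sizeof(payload)) != -1)
		n++;
	return n;
}
```

Looping until the getter reports that nothing complete is left ensures that
a single read which delivered several messages does not leave any of them
queued until the next poll event.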


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.252 23-Jul-2022 claudio

Refactor and rename bgpd_filternexthop() to bgpd_oknexthop()

Simplify the logic and adjust kroute_match() which makes the code
easier to understand.
OK tb@


# 1.251 22-Jul-2022 claudio

Revert previous commit. The RTP_MINE checks on struct kroute_full are
not correct because kr_tofull() replaces RTP_MINE with the real priority.
Noticed because of incorrect nexthop selection.


# 1.250 22-Jul-2022 claudio

Retire the F_KERNEL flag, it got superseded by route priority and RTP_MINE.

Only problem is when route(8) is used to modify/delete a bgpd owned route.
Exact behaviour for that is still a bit unclear but F_KERNEL does not help
in this case either. In the kr_fib_delete/change remove F_BGPD_INSERTED
in that case as a first step.
OK tb@


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (as is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@
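
A minimal sketch of the EINTR handling described in 1.240 and 1.241 above, assuming a hypothetical wrapper named poll_once() (not a bgpd function). It reports an interrupted poll as "nothing ready" so the caller skips the pollfd array but still reaches its signal-flag checks at the end of the loop.

```c
#include <errno.h>
#include <poll.h>

/*
 * Returns the number of ready descriptors, 0 on timeout or EINTR (the
 * caller must not inspect pfd[] in that case), and -1 on a hard error.
 */
static int
poll_once(struct pollfd *pfd, nfds_t n, int timeout_ms)
{
	int r;

	r = poll(pfd, n, timeout_ms);
	if (r == -1) {
		if (errno == EINTR)
			return 0;	/* interrupted by a signal, restart */
		return -1;		/* anything else: fail hard */
	}
	return r;
}
```

A loop built on this would process pfd[] only when the result is positive and then unconditionally check flags set by its signal handlers, so a reconfigure or mrt reopen request is handled right after the interrupted poll.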


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
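
The evaluation-order problem described above, as a small sketch. toy_addr2sa() is a hypothetical stand-in with a simplified signature, not bgpd's addr2sa(); what matters is that it reports the length through a pointer while the same variable is also passed by value in the same call.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Stand-in helper: fills a static sockaddr and stores its size in *len. */
static struct sockaddr *
toy_addr2sa(in_port_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

static int
toy_bind(int fd, in_port_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Fragile form:  bind(fd, toy_addr2sa(port, &len), len);
	 * C leaves the order of argument evaluation unspecified, so the
	 * third argument may be read before the helper has stored into len.
	 */
	sa = toy_addr2sa(port, &len);	/* len is assigned here ... */
	return bind(fd, sa, len);	/* ... and only read afterwards */
}
```

Splitting the call into two statements puts a sequence point between the write and the read of len, which makes the result independent of the compiler.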


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
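
A hedged sketch of enlarging the buffers on a socketpair, as the entry above describes. The function name toy_socketpair() and the choice to treat a failed setsockopt() as fatal are assumptions for the example; the real code and its buffer constant are not shown in this log.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Create a socketpair and request bufsz bytes for both directions. */
static int
toy_socketpair(int sv[2], int bufsz)
{
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsz, sizeof(bufsz)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsz, sizeof(bufsz)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```

A caller following the commit message would pass four times its read size as bufsz. Kernels may round or clamp the requested value, so the effective size should be read back with getsockopt() if it matters.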


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
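
The dup2() corner case above in a short sketch. toy_move_fd() is a hypothetical helper, not the function changed by this commit. dup2() with equal arguments does nothing, so a close-on-exec flag on the descriptor survives and the descriptor vanishes at exec time; the flag has to be cleared explicitly in that case.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make fd available as descriptor number target across exec(). */
static int
toy_move_fd(int fd, int target)
{
	int flags;

	if (fd == target) {
		/* dup2(fd, fd) is a no-op and would keep FD_CLOEXEC set. */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* The duplicate made by dup2() never has FD_CLOEXEC set. */
	return dup2(fd, target) == -1 ? -1 : 0;
}
```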


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
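
The bsearch-based membership test mentioned above, sketched under assumptions: the set is a plain sorted array of 32-bit AS numbers and the toy_* names are invented for this example. The actual as-set structures in bgpd are not shown in this log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort(3) and bsearch(3). */
static int
toy_as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once after loading it; bsearch needs sorted input. */
static void
toy_as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), toy_as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
static int
toy_as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), toy_as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, compared with a linear walk over a long list of AS numbers in a filter rule.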


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
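
The "pipe teardown" idea above from the child's point of view, as a minimal sketch. toy_read_until_eof() is a hypothetical name; a real child would sit in a poll loop and dispatch messages, but the exit condition is the same: read(2) returning 0 means the parent closed its end.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Consume data from the IPC descriptor until the peer closes it.
 * Returns the number of bytes seen before EOF, or -1 on error.
 */
static ssize_t
toy_read_until_eof(int fd)
{
	char buf[512];
	ssize_t n, total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			return total;	/* EOF: parent is gone, shut down */
		total += n;
	}
}
```

Because the parent only has to close descriptors, it needs no kill(2) and therefore no "proc" pledge, and it never acts on a PID that may have been reused.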


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
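
A minimal sketch of the pattern described above, not the actual bgpd.c code:
the pollfd array is rebuilt from zero before every poll() call, and revents is
only inspected when poll() reports ready descriptors. PFD_COUNT, poll_pipes()
and drain_ready() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define PFD_COUNT 2	/* hypothetical: one slot per child pipe */

/*
 * Rebuild the pollfd array from scratch before every poll() call so
 * that revents never carries stale bits from an earlier iteration.
 * Callers should only look at revents when the return value is > 0.
 */
static int
poll_pipes(struct pollfd pfd[PFD_COUNT], int fd_a, int fd_b, int timeout_ms)
{
	assert(pfd != NULL);

	memset(pfd, 0, sizeof(pfd[0]) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}

/* Read and discard one byte from every descriptor flagged readable. */
static int
drain_ready(const struct pollfd pfd[PFD_COUNT])
{
	char c;
	int i, handled = 0;

	for (i = 0; i < PFD_COUNT; i++)
		if ((pfd[i].revents & POLLIN) && read(pfd[i].fd, &c, 1) == 1)
			handled++;
	return handled;
}
```

If poll_pipes() returns -1 or 0 the caller skips drain_ready() entirely, so a
failed poll() can never cause a blocking read on an idle pipe.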


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
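
The ordering issue can be sketched as follows. This is an illustration under
assumptions, not the code from this revision: reconfig_pending,
sighup_handler(), do_reload() and check_reconfig() are hypothetical names.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag set by a SIGHUP handler; not taken from bgpd.c. */
static volatile sig_atomic_t reconfig_pending;
static int reloads_done;

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_reload() is running sets the flag again and is serviced by the
 * next call instead of being wiped out by a late "flag = 0".
 * Returns 1 if work was done, 0 otherwise.
 */
static int
check_reconfig(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;
	do_reload();
	return 1;
}
```

With the reverse order (work first, clear afterwards) a second SIGHUP delivered
during do_reload() would be lost without any trace.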


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
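
For readers unfamiliar with descriptor passing, here is a generic sketch using
plain sendmsg(2)/recvmsg(2) with SCM_RIGHTS. It is not the imsg/msgbuf
implementation referred to above; send_fd() and recv_fd() are hypothetical
helpers that only show the underlying mechanism.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send one descriptor over a UNIX-domain socket. Returns 0 or -1. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;		/* one dummy payload byte */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor. Returns the new fd or -1. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

A privileged process can call socket() and bind() on a low port and then hand
the result to an unprivileged peer with send_fd(); the receiver gets its own
descriptor number referring to the same open socket.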


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
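
A hedged sketch of the check described above: after SIGCHLD, ask waitpid(2)
whether a child really terminated and how, instead of assuming the worst.
reap_child() is a hypothetical helper, not the function used in bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Reap at most one child without blocking.
 * Returns the pid of a terminated child, 0 if no child has terminated
 * (SIGCHLD is also raised for stopped or continued children), or -1 on
 * error. On success exactly one of *exit_code (>= 0) or *term_sig (> 0)
 * describes how the child went away.
 */
static pid_t
reap_child(int *exit_code, int *term_sig)
{
	int status;
	pid_t pid;

	*exit_code = -1;
	*term_sig = 0;

	do {
		pid = waitpid(-1, &status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid <= 0)
		return pid;
	if (WIFEXITED(status))
		*exit_code = WEXITSTATUS(status);
	else if (WIFSIGNALED(status))
		*term_sig = WTERMSIG(status);
	return pid;
}
```

The caller would log the exit code or signal number and start its own shutdown
only when the returned pid is greater than zero.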


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.249 20-Jul-2022 claudio

Cleanup and fix the network code.

- introduce network_free() to properly free a network struct including
the possible rtlabel reference.
- change expand_networks() and the reload code to not only expand the
main network config but also the network configs inside L3VPN sections.
- adjust reload logic to properly match any kind of network struct.
Up until now rtlabel and priority network statements were not correctly
reloaded.
OK tb@


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
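
The control flow described above can be sketched roughly like this. It is an
illustration under assumptions, not the code from this revision:
reconfig_flag, handle_fds() and main_loop_step() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>

/* Hypothetical flag, set from a signal handler elsewhere. */
static volatile sig_atomic_t reconfig_flag;
static int fd_events_handled;

/* Stand-in for the per-descriptor dispatch code. */
static void
handle_fds(const struct pollfd *pfd, nfds_t n)
{
	nfds_t i;

	for (i = 0; i < n; i++)
		if (pfd[i].revents != 0)
			fd_events_handled++;
}

/*
 * One pass of the main loop. Returns -1 on a hard poll() failure,
 * 1 if a pending reconfigure request was serviced, 0 otherwise.
 */
static int
main_loop_step(struct pollfd *pfd, nfds_t n, int timeout_ms)
{
	int nready;

	nready = poll(pfd, n, timeout_ms);
	if (nready == -1) {
		if (errno != EINTR)
			return -1;
		nready = 0;	/* interrupted: do not look at pfd[] */
	}
	if (nready > 0)
		handle_fds(pfd, n);

	/* Signal checks run even when poll() was interrupted. */
	if (reconfig_flag) {
		reconfig_flag = 0;
		return 1;
	}
	return 0;
}
```

The point of the split is that an EINTR caused by SIGHUP leads straight to the
flag check at the bottom rather than to another full poll() timeout.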


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In anycase do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@
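
A generic sketch of a non-blocking connect, assuming the usual POSIX sequence
(O_NONBLOCK, connect(), wait for POLLOUT, then read SO_ERROR). The helper names
set_nonblock(), start_connect() and connect_result() are hypothetical and are
not claimed to match what bgpd.c does.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Switch a descriptor to non-blocking mode. Returns 0 or -1. */
static int
set_nonblock(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Start a connect without blocking the caller.
 * Returns 0 if connected at once, 1 if the connect is in progress
 * (poll for POLLOUT, then call connect_result()), -1 on error.
 */
static int
start_connect(int fd, const struct sockaddr *sa, socklen_t len)
{
	if (set_nonblock(fd) == -1)
		return -1;
	if (connect(fd, sa, len) == 0)
		return 0;
	if (errno == EINPROGRESS)
		return 1;
	return -1;
}

/* Once the socket polls writable, fetch the deferred result. */
static int
connect_result(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return -1;
	if (err != 0) {
		errno = err;
		return -1;
	}
	return 0;
}
```

An unreachable address then costs the event loop nothing: the socket simply
never becomes writable until the kernel gives up and SO_ERROR reports why.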


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
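
The pitfall can be shown with a stand-in helper. make_sockaddr() below is
hypothetical and only mimics the shape described above (it returns a sockaddr
and stores its size through a pointer); the real addr2sa() signature may
differ.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills a static sockaddr and
 * reports its size through *len. Returns NULL on a bad address.
 */
static struct sockaddr *
make_sockaddr(const char *ip, unsigned short port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return NULL;
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

static int
bind_to(int fd, const char *ip, unsigned short port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Avoid: bind(fd, make_sockaddr(ip, port, &len), len);
	 * The order in which function arguments are evaluated is
	 * unspecified in C, so len may be read before the helper has
	 * stored the real size into it.
	 */
	if ((sa = make_sockaddr(ip, port, &len)) == NULL)
		return -1;
	return bind(fd, sa, len);
}
```

Splitting the call into two statements forces the helper to run first, so the
length passed to bind() or connect() is always the one it produced.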


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
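
A rough sketch of the idea, assuming a hypothetical READ_SIZE constant for the read chunk size; the actual constants and error handling in bgpd may differ.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* hypothetical read chunk size */

/*
 * Create a socketpair and ask for send/receive buffers of four times
 * the read size on both ends. Returns 0 on success, -1 on error.
 */
int
make_pipe(int fds[2])
{
	int i, bsize = 4 * READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(fds[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(fds[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}
```

With buffers larger than a single read, one poll() wakeup can usually be answered with a full-sized read or write rather than many short ones.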


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
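
One way to handle the oldd == newd case, shown as a sketch with a hypothetical helper place_fd(); this is an illustration of the technique, not necessarily how the commit implements it.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make fd available as descriptor number target in an exec'd child.
 * Returns 0 on success, -1 on error.
 */
int
place_fd(int fd, int target)
{
	int flags;

	if (fd == target) {
		/* dup2() would be a no-op; drop close-on-exec by hand */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* a descriptor created by dup2() does not have FD_CLOEXEC set */
	return dup2(fd, target) == -1 ? -1 : 0;
}
```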


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs, so better call it before doing the
set_pollfd; this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
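
The bsearch approach can be sketched like this. The function names as_set_prepare() and as_set_match() are hypothetical and the set is modelled as a plain array of 32-bit AS numbers, which is an assumption about the layout rather than the actual bgpd structure.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once after the set has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

A lookup costs O(log n) comparisons, compared with walking a long list of AS numbers in a filter rule.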


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
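
A small self-contained sketch of such a pipe teardown, under the assumption that the child's event loop is reduced to a bare read() loop. The names child_loop() and teardown_demo() are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Child event loop reduced to its core: read until the peer closes
 * its end, at which point read() returns 0 (EOF).
 */
static void
child_loop(int fd)
{
	char buf[64];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* a real daemon would dispatch messages here */
	_exit(n == 0 ? 0 : 1);
}

/*
 * Fork a child, then shut it down by closing the IPC socket instead
 * of sending a signal. Returns the child's exit status, or -1.
 */
int
teardown_demo(void)
{
	int sv[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		child_loop(sv[1]);	/* does not return */
	}
	close(sv[1]);
	close(sv[0]);		/* child now sees EOF and exits */
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

The parent never calls kill(2), so it needs no stored PID for signalling and cannot hit a recycled PID.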


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at
runtime. Feature request by a few people who often forgot to add -r path
when restarting bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
    nexthop qualify via bgp
    nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
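
A sketch of the defensive pattern, assuming a hypothetical helper poll_pair() that watches two descriptors. In a daemon the pollfd array typically lives across loop iterations, which is where stale revents come from; here the array is local to keep the example short.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT	2

/*
 * Poll two descriptors for input. Returns poll()'s result; the
 * revents array is filled in only when poll() reported ready fds
 * and is zeroed on error or timeout.
 */
int
poll_pair(int fd0, int fd1, int timeout, short revents[PFD_COUNT])
{
	struct pollfd pfd[PFD_COUNT];
	int i, nfds;

	/* start from a clean array so no old revents can survive */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd0;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd1;
	pfd[1].events = POLLIN;

	nfds = poll(pfd, PFD_COUNT, timeout);
	for (i = 0; i < PFD_COUNT; i++)
		revents[i] = nfds > 0 ? pfd[i].revents : 0;
	return nfds;
}
```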


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
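
The ordering can be illustrated with a minimal sketch. The flag name reconfig, the counter reloads and the handler sighdlr() are hypothetical; the counter merely stands in for the real work.

```c
#include <signal.h>

volatile sig_atomic_t reconfig;	/* set by the signal handler */
int reloads;			/* how often the work was done */

void
sighdlr(int sig)
{
	if (sig == SIGHUP)
		reconfig = 1;
}

/* One pass of the main loop's signal check. */
void
check_reconfig(void)
{
	if (reconfig) {
		/*
		 * Clear first: a SIGHUP arriving while the reload runs
		 * sets the flag again and is picked up on the next pass.
		 * Clearing afterwards would wipe out that request.
		 */
		reconfig = 0;
		reloads++;	/* stands in for the real reload work */
	}
}
```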


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
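
Descriptor passing over a UNIX socket relies on SCM_RIGHTS control messages. The following is a generic sketch of that mechanism with hypothetical helpers send_fd() and recv_fd(); it does not reproduce the imsg/msgbuf framework itself.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Send descriptor fd over the UNIX socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;		/* one byte of ordinary payload */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the scenario from the commit message, the privileged side would create and bind the socket and hand it over with something like send_fd(); the receiver then gets its own descriptor number referring to the same open socket.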


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
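
The read/get split can be illustrated with a toy framing layer. Everything below is an assumption made for the example: the one-byte length prefix, struct rbuf, toy_read() and toy_get() are not the real imsg layout or API.

```c
#include <sys/types.h>
#include <string.h>
#include <unistd.h>

/* Toy framing, NOT the real imsg layout: a 1-byte length, then payload. */
struct rbuf {
	unsigned char	buf[4096];
	size_t		len;		/* bytes currently buffered */
};

/* Fill the buffer with whatever one read(2) returns. */
static ssize_t
toy_read(int fd, struct rbuf *rb)
{
	ssize_t	n;

	n = read(fd, rb->buf + rb->len, sizeof(rb->buf) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return (n);	/* 0 means the peer closed; the caller must cope */
}

/*
 * Extract one complete message without touching the fd.
 * Returns the payload length, or -1 if no complete message is
 * buffered (or it does not fit into out).
 */
static int
toy_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	size_t	mlen;

	if (rb->len < 1)
		return (-1);
	mlen = rb->buf[0];
	if (rb->len < 1 + mlen || mlen > outsz)
		return (-1);
	memcpy(out, rb->buf + 1, mlen);
	rb->len -= 1 + mlen;
	memmove(rb->buf, rb->buf + 1 + mlen, rb->len);
	return ((int)mlen);
}
```

A caller would invoke toy_read() once per poll event and then call toy_get() in a loop until it returns -1, so that several messages delivered by one read are all processed before the next poll.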


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
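
One way to express the rule above is a thin wrapper around poll(2). This is a sketch under the assumption that the caller keeps its signal checks at the end of every loop iteration; poll_ready() is a hypothetical helper, not a bgpd function.

```c
#include <errno.h>
#include <poll.h>

/*
 * Returns the number of ready descriptors, 0 if the pollfds must not
 * be inspected (timeout or EINTR), or -1 on a hard error.
 */
static int
poll_ready(struct pollfd *pfd, nfds_t n, int timeout)
{
	int	nready;

	nready = poll(pfd, n, timeout);
	if (nready == -1)
		return (errno == EINTR ? 0 : -1);
	return (nready);
}
```

The main loop would look at revents only when the result is positive, and would run its reconfigure and mrt flag checks after that block regardless of the result, so that a signal that interrupted poll() is acted on immediately.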


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
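
C does not fix the order in which function arguments are evaluated, which is the hazard described above. The sketch below uses loopback_sa() as a hypothetical stand-in for addr2sa(); only the calling pattern is the point.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills a static sockaddr and
 * reports its length through *len.
 */
static struct sockaddr *
loopback_sa(in_port_t port, socklen_t *len)
{
	static struct sockaddr_in	sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(sin);
	return ((struct sockaddr *)&sin);
}

static int
bind_loopback(int fd, in_port_t port)
{
	struct sockaddr	*sa;
	socklen_t	 len = 0;

	/*
	 * Unsafe: bind(fd, loopback_sa(port, &len), len);
	 * the compiler may read len before the call stores into it.
	 */
	sa = loopback_sa(port, &len);	/* sequenced before bind() */
	return (bind(fd, sa, len));
}
```

Splitting the call into two statements makes the result independent of the compiler's argument evaluation order.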


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
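
Growing the kernel buffers of a socketpair end is done with setsockopt(2). The sketch below assumes the read size is passed in by the caller and does no retry with smaller sizes; grow_sockbuf() is a hypothetical helper, not the function used in bgpd.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: set both kernel buffers to 4 times readsize. */
static int
grow_sockbuf(int fd, int readsize)
{
	int	size = 4 * readsize;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) == -1)
		return (-1);
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) == -1)
		return (-1);
	return (0);
}
```

Systems differ in how they treat a request above their limit (some clamp it, some fail), so production code would likely need a fallback that this sketch leaves out.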


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
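
The idiom can be shown with a toy copy function. struct peer_config, toy_compose() and send_peer() are hypothetical and only stand in for a real imsg compose call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object type for the example. */
struct peer_config {
	int	id;
	char	descr[32];
};

/* Hypothetical stand-in for an imsg compose call: copies len bytes. */
static size_t
toy_compose(void *dst, const void *data, size_t len)
{
	memcpy(dst, data, len);
	return (len);
}

static size_t
send_peer(void *dst, const struct peer_config *p)
{
	/* sizeof(*p) follows the pointer's type if that type ever changes. */
	return (toy_compose(dst, p, sizeof(*p)));
}
```

With sizeof(struct peer_config) the size and the pointer can silently drift apart when the variable's type is changed; sizeof(*p) cannot.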


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
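
A sketch of the case distinction described above, assuming the goal is to have a descriptor survive exec() at a fixed number. move_fd() is a hypothetical helper name, not the function changed in this revision.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make fd available as target across exec(). Returns 0 or -1. */
static int
move_fd(int fd, int target)
{
	int	flags;

	if (fd == target) {
		/* dup2() would do nothing here and leave FD_CLOEXEC set. */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* A descriptor created by dup2() has FD_CLOEXEC cleared. */
	return (dup2(fd, target) == -1 ? -1 : 0);
}
```

Only the equal-descriptor branch needs the explicit fcntl(2) call, because a duplicate made by dup2() never inherits the close-on-exec flag.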


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
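
A lookup of this kind can be sketched with a sorted array and bsearch(3). The function names and the plain uint32_t array are assumptions for the example and do not reflect the set_table structures in bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once when the set is loaded so lookups can use bsearch(3). */
static void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup costs a logarithmic number of comparisons, compared with walking a long list of AS numbers in a filter rule.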


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
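
A rough sketch of the idea, assuming a platform that supports the
SOCK_NONBLOCK and SOCK_CLOEXEC type flags: the flags are requested when the
socket is created, so no separate fcntl(2) step is needed afterwards. The
function names are made up for illustration, and AF_UNIX is used only to
keep the example self-contained.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a socket that is non-blocking and close-on-exec from the start. */
int
open_nonblock_cloexec(void)
{
	return (socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC,
	    0));
}

/* Returns 1 if both flags are set on fd, 0 if not, -1 on error. */
int
flags_are_set(int fd)
{
	int	fl, fdfl;

	if ((fl = fcntl(fd, F_GETFL)) == -1)
		return (-1);
	if ((fdfl = fcntl(fd, F_GETFD)) == -1)
		return (-1);
	return ((fl & O_NONBLOCK) != 0 && (fdfl & FD_CLOEXEC) != 0);
}
```

accept4() takes the same two flags for accepted connections, which is what
makes a helper such as session_socket_blockmode() unnecessary.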


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate a IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
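
A small sketch of the fix described above, not the actual bgpd loop. The
array size, the two descriptors and the name setup_pfds() are assumptions
for illustration. The array is zeroed on every iteration, so a failed
poll(2) cannot leave revents values from an earlier round behind.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT	2	/* hypothetical: one pipe per child */

/*
 * Rebuild the pollfd array from scratch before each poll(2) call.
 * Zeroing first guarantees that revents is 0 everywhere even if
 * poll() returns -1 (e.g. EINTR) without writing the array back.
 */
void
setup_pfds(struct pollfd *pfd, int fd_se, int fd_rde)
{
	memset(pfd, 0, sizeof(struct pollfd) * PFD_COUNT);
	pfd[0].fd = fd_se;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_rde;
	pfd[1].events = POLLIN;
}
```

Without the reset, a stale POLLIN could make the parent call a blocking
read on a pipe that has nothing to deliver.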


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
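
The ordering can be sketched as follows. All names here are invented for
illustration and are not taken from bgpd.c. If the flag were cleared after
the work, a signal arriving while the work runs would set the flag and then
be wiped out; clearing first leaves such a signal pending for the next pass.

```c
#include <signal.h>

volatile sig_atomic_t	reconfig_pending;	/* set by the handler */
int			reloads_done;		/* counts work performed */

/* Signal handler: only records that work was requested. */
void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop. */
void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;	/* clear the flag first ... */
		do_reload();		/* ... then do the work */
	}
}
```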


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
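
A hedged sketch of the check described above, with invented helper names.
SIGCHLD only says that a child changed state, so the status returned by
waitpid(2) has to be inspected before concluding that the child is gone.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the wait status says the child is really gone (it exited
 * or was terminated by a signal), 0 for any other state change.
 */
int
child_is_gone(int status)
{
	return (WIFEXITED(status) || WIFSIGNALED(status));
}

/*
 * Illustration only: fork a child that exits with `code`, reap it and
 * store the raw wait status in *status. Returns 0 on success, -1 on error.
 */
int
reap_child(int code, int *status)
{
	pid_t	pid;

	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0)
		_exit(code);
	if (waitpid(pid, status, 0) != pid)
		return (-1);
	return (0);
}
```

A parent following this pattern would log and shut down only when
child_is_gone() is true for one of its children.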


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
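
The read-once, get-in-a-loop pattern can be illustrated with a toy framing
layer. This is not the imsg API: struct rbuf, rbuf_fill(), toy_get() and
drain() are hypothetical, and the one-byte length header (counting itself)
is an assumption chosen to keep the sketch short.

```c
#include <stddef.h>
#include <string.h>

struct rbuf {
	unsigned char	data[256];
	size_t		len;		/* bytes currently buffered */
};

/* Append raw bytes the way a single read(2) would. 0 on success, -1 if full. */
int
rbuf_fill(struct rbuf *rb, const void *p, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, p, n);
	rb->len += n;
	return (0);
}

/*
 * Extract one message from the buffer without doing any I/O.
 * Returns 1 if a message was copied to out, 0 if no complete message
 * is buffered yet, -1 on a malformed frame.
 */
int
toy_get(struct rbuf *rb, unsigned char *out, size_t *outlen)
{
	size_t	framelen;

	if (rb->len < 1)
		return (0);
	framelen = rb->data[0];		/* header + payload */
	if (framelen < 1)
		return (-1);
	if (rb->len < framelen)
		return (0);
	*outlen = framelen - 1;
	memcpy(out, rb->data + 1, *outlen);
	memmove(rb->data, rb->data + framelen, rb->len - framelen);
	rb->len -= framelen;
	return (1);
}

/* Process every complete message; returns the count or -1 on error. */
int
drain(struct rbuf *rb)
{
	unsigned char	payload[255];
	size_t		plen;
	int		n = 0, r;

	while ((r = toy_get(rb, payload, &plen)) == 1)
		n++;			/* a real caller would dispatch here */
	return (r == -1 ? -1 : n);
}
```

Calling the extractor only once per poll event would leave the second
message sitting in the buffer until more data arrived, which is the queue
build-up the commit message describes.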


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
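
A minimal sketch of why the signed type matters: read(2) reports errors as -1, which an unsigned size_t cannot represent. read_some() is a hypothetical helper, not a function from bgpd.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes from fd.
 * Returns 1 if data was read, 0 on EOF, -1 on error.
 */
static int
read_some(int fd, void *buf, size_t len)
{
	ssize_t n;	/* signed, so the -1 error return stays visible */

	assert(buf != NULL);
	n = read(fd, buf, len);
	if (n == -1)
		return (-1);
	if (n == 0)
		return (0);
	return (1);
}
```

Had n been declared as size_t, the n == -1 comparison would only work by accident of integer conversion, and any "n < 0" check would silently never fire.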


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.248 23-Jun-2022 claudio

Move struct kif from bgpd.h to kroute.c

The only user of struct kif was the session engine for the 'depend on'
feature. Switch the imsg exchange to a new struct session_dependon and
rename the IMSG as well.
OK tb@


# 1.247 22-Jun-2022 claudio

Use struct kroute_full in bgpd_filternexthop() so this code becomes a lot
simpler.

OK tb@


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@
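
The rule described above can be sketched as a small classifier for the poll(2) result. The enum and classify_poll() are hypothetical names for illustration; the actual bgpd loops are structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>

/* What a poll loop iteration should do with the result of poll(2). */
enum poll_action {
	POLL_HANDLE_FDS,	/* success or timeout: pollfds may be inspected */
	POLL_SKIP_FDS,		/* EINTR: restart the loop, ignore pollfds */
	POLL_FAIL		/* any other error: fail hard, ignore pollfds */
};

/*
 * Classify the return value rv of poll(2) together with the errno
 * value err that was saved right after the call.
 */
static enum poll_action
classify_poll(int rv, int err)
{
	assert(rv >= -1);
	if (rv == -1) {
		if (err == EINTR)
			return (POLL_SKIP_FDS);
		return (POLL_FAIL);
	}
	return (POLL_HANDLE_FDS);
}
```

A caller would do "rv = poll(pfd, n, timeout); act = classify_poll(rv, errno);" and only look at revents when act is POLL_HANDLE_FDS.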


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
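
A sketch of the pattern, assuming a helper that returns the sockaddr and stores its length through a pointer. fill_sa() is a hypothetical stand-in for addr2sa() and does not reproduce its real signature. In C the order in which function arguments are evaluated is unspecified, which is why the call is hoisted out.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for addr2sa(): fills a static IPv4 loopback
 * sockaddr and reports its size through *len.
 */
static struct sockaddr *
fill_sa(uint16_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	assert(len != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(sin);
	return ((struct sockaddr *)&sin);
}

static int
bind_loopback(int fd, uint16_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Risky form: bind(fd, fill_sa(port, &len), len);
	 * len may be read before fill_sa() has stored to it.
	 * Calling the helper first makes the ordering explicit.
	 */
	sa = fill_sa(port, &len);
	return (bind(fd, sa, len));
}
```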


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
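
A rough sketch of the idea, assuming a read size of 65535 bytes; the constant SKETCH_READ_SIZE and the function name are made up for illustration and the real bgpd code may size and retry differently.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

#define SKETCH_READ_SIZE 65535	/* assumed read size, not taken from bgpd */

/*
 * Create a socketpair and ask for send and receive buffers of four
 * times the read size on both ends. Returns 0 on success, -1 on error.
 */
static int
pair_with_big_buffers(int sv[2])
{
	int i, bsize = 4 * SKETCH_READ_SIZE;

	assert(sv != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

The kernel is free to round or clamp the requested size, so the effective value should be read back with getsockopt() if it matters.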


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
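
A small sketch of the idiom. struct demo_peer, copy_msg() and send_peer() are hypothetical; copy_msg() merely stands in for an imsg compose call so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload type, for illustration only. */
struct demo_peer {
	uint32_t	id;
	char		descr[32];
};

/* Stand-in for an imsg compose call: copies len bytes if they fit. */
static size_t
copy_msg(void *dst, size_t dstlen, const void *data, size_t len)
{
	if (len > dstlen)
		return (0);
	memcpy(dst, data, len);
	return (len);
}

static size_t
send_peer(void *dst, size_t dstlen, const struct demo_peer *p)
{
	assert(p != NULL);
	/*
	 * sizeof(*p) follows the type of p automatically; writing
	 * sizeof(struct demo_peer) would go stale if p changed type.
	 */
	return (copy_msg(dst, dstlen, p, sizeof(*p)));
}
```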


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
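
One way to handle the case is sketched below; it is an assumption about a reasonable fix, not a copy of the committed code. move_fd() is a hypothetical helper that clears FD_CLOEXEC by hand when dup2() would be a no-op.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make oldd available as newd across a following exec().
 * Returns 0 on success, -1 on error.
 */
static int
move_fd(int oldd, int newd)
{
	int flags;

	assert(oldd >= 0 && newd >= 0);
	if (oldd == newd) {
		/* dup2() would do nothing here and leave FD_CLOEXEC set. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* The new descriptor created by dup2() does not carry FD_CLOEXEC. */
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```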


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
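
The lookup described above can be sketched with the standard qsort(3) and bsearch(3) calls. The function names and the plain uint32_t array are assumptions made for illustration; the real set code in bgpd uses its own structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort(3) and bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort the set once after it has been filled. */
static void
as_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL);
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	assert(set != NULL);
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup costs O(log n) comparisons instead of walking a list of AS numbers one by one.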


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
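
The child side of such a teardown can be sketched as a read loop that ends on EOF. drain_until_eof() is a hypothetical helper; the real children run a poll-based event loop rather than a blocking read.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from fd until the peer closes its end.
 * Returns the number of bytes seen before EOF, or -1 on error.
 */
static ssize_t
drain_until_eof(int fd)
{
	char buf[512];
	ssize_t n, total = 0;

	assert(fd >= 0);
	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)
			return (total);	/* EOF: the other side closed */
		total += n;
	}
}
```

Once the parent closes its descriptor the loop returns without any signal being sent, so no PID has to be tracked for kill(2).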


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
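
A brief sketch of detecting a broken pipe without the signal: with SIGPIPE ignored, write(2) on a pipe whose reader is gone fails with EPIPE. Both helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Ignore SIGPIPE for the whole process. Returns 0 on success. */
static int
ignore_sigpipe(void)
{
	return (signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0);
}

/*
 * Write buf to fd.
 * Returns 0 on success, 1 if the peer is gone (EPIPE), -1 otherwise.
 */
static int
write_detect_closed(int fd, const void *buf, size_t len)
{
	assert(buf != NULL);
	if (write(fd, buf, len) == -1)
		return (errno == EPIPE ? 1 : -1);
	return (0);
}
```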


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass a IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate a IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected or all static routes, respectively. The list is
automatically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
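
The ordering can be sketched as follows. The flag and function names
(reconfig_pending, do_reload, handle_pending) are illustrative
assumptions, not the identifiers used in bgpd.c:

```c
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;	/* hypothetical flag */
static int reloads_done;

/* Signal handler: only records that work is pending. */
static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. re-reading the config file. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work.  A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
int
handle_pending(void)
{
	if (!reconfig_pending)
		return (0);
	reconfig_pending = 0;
	do_reload();
	return (1);
}
```

With the reversed order, a signal delivered during the work would be
lost because the flag is reset after the handler has already set it.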


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
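
A hedged sketch of the check described above. The helper names
(child_is_gone, reap_children) are made up for illustration; only the
waitpid() call and the status macros are standard:

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: decide from a waitpid() status whether the child
 * is really gone.  Returns 1 if it exited or was killed by a signal,
 * 0 for other state changes such as a stopped child.
 */
int
child_is_gone(int status)
{
	if (WIFEXITED(status))
		return (1);
	if (WIFSIGNALED(status))
		return (1);
	return (0);
}

/*
 * Reap every child that changed state, without blocking.  Returns the
 * number of children that actually terminated.
 */
int
reap_children(void)
{
	pid_t pid;
	int status, gone = 0;

	while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
		if (child_is_gone(status))
			gone++;
	}
	return (gone);
}
```

A caller would log and shut down only when the count is non-zero, since
SIGCHLD is also delivered for state changes that are not fatal.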


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
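
The split between reading and parsing can be illustrated with a toy
framing scheme. Everything here (struct rbuf, frame_get, frame_drain, the
one-byte length prefix) is an assumption made for the example and is not
the imsg API or wire format:

```c
#include <string.h>

/* Toy read buffer; layout is illustrative only. */
struct rbuf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one message (1-byte length, then payload) from the buffer.
 * Never reads from a descriptor itself.  out must hold 255 bytes.
 * Returns 1 if a message was extracted, 0 if none is complete yet.
 */
int
frame_get(struct rbuf *rb, unsigned char *out, size_t *outlen)
{
	size_t need;

	if (rb->len < 1)
		return (0);
	need = 1 + (size_t)rb->data[0];
	if (rb->len < need)
		return (0);	/* incomplete, wait for more data */
	*outlen = need - 1;
	memcpy(out, rb->data + 1, *outlen);
	/* Shift the remaining bytes to the front of the buffer. */
	memmove(rb->data, rb->data + need, rb->len - need);
	rb->len -= need;
	return (1);
}

/* After one read, drain every complete message; returns how many. */
int
frame_drain(struct rbuf *rb)
{
	unsigned char msg[255];
	size_t msglen;
	int n = 0;

	while (frame_get(rb, msg, &msglen))
		n++;
	return (n);
}
```

Calling the extractor only once per poll event would leave the second
message sitting in the buffer, which is the queue build-up the commit
message describes.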


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.246 15-Jun-2022 claudio

Rename F_BGPD_INSERTED to F_BGPD and use F_BGPD_INSERTED as a flag that
indicates that the route was successfully added to the FIB.

Filter out dynamic routes (like it is already done for ARP and ND routes) and
kill F_DYNAMIC.

Also remove the protect_lo() bits. Adding dummy kroute entries does no longer
prevent bad routes to hit the FIB. Also loopback IPs are checked in a few
other places to prevent bad routes to be installed into the FIB.

OK tb@


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
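
The evaluation-order problem can be sketched with a stand-in helper.
v4_to_sa() and bind_v4() are hypothetical names written for this example
and handle IPv4 only; they are not bgpd's addr2sa():

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical stand-in for a helper like addr2sa(): fill in a sockaddr
 * for an IPv4 address (network byte order) and report its size via *len.
 */
struct sockaddr *
v4_to_sa(struct sockaddr_in *sin, in_addr_t addr, in_port_t port,
    socklen_t *len)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = addr;
	sin->sin_port = port;
	*len = sizeof(*sin);
	return ((struct sockaddr *)sin);
}

/*
 * Risky:  bind(fd, v4_to_sa(&sin, a, p, &len), len);
 * The order in which function arguments are evaluated is unspecified,
 * so len may be read before the helper has stored a value into it.
 *
 * Safe: finish the helper call first, then pass its results on.
 */
int
bind_v4(int fd, in_addr_t addr, in_port_t port)
{
	struct sockaddr_in sin;
	struct sockaddr *sa;
	socklen_t len = 0;

	sa = v4_to_sa(&sin, addr, port, &len);
	return (bind(fd, sa, len));
}
```

Whether the inlined form happens to work depends on the compiler, which
would explain why such a bug can show up on one platform only.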


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
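
A minimal sketch of enlarging the buffers on both ends of a socketpair.
The function name big_socketpair and the bufsize parameter are
assumptions for illustration; the commit itself only states that the
buffers are set to four times the read size:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a socketpair and request larger send and receive buffers on
 * both ends.  Returns 0 on success, -1 on failure (both descriptors are
 * closed again in that case).
 */
int
big_socketpair(int sv[2], int bufsize)
{
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsize, sizeof(bufsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsize, sizeof(bufsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

This sketch treats a failed setsockopt() as fatal. A daemon might
instead log the failure and continue with the default buffer sizes.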


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
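
One way to handle the equal-descriptor case, shown as a sketch. The
helper name fd_for_exec is hypothetical and this is not necessarily the
exact code of the commit:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd so that it survives exec().
 * dup2() clears close-on-exec on the new descriptor, but when
 * oldd == newd it does nothing, so the flag has to be cleared by hand.
 * Returns 0 on success, -1 on error.
 */
int
fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```

The equal case is easy to miss because it only happens when the
descriptor already has the number the child process expects.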


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
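
The three ovs states used in the filter rules above follow the usual origin validation logic: a route is valid if some covering roa-set entry matches its origin AS and its prefix length does not exceed maxlen, invalid if entries cover it but none match, and not-found otherwise. A minimal sketch of that classification, assuming IPv4 only and a linear scan (bgpd itself uses a trie); struct roa, roa_covers() and roa_check() are hypothetical names, and maxlen is assumed to be filled in with the prefix length when the config omits it:

```c
#include <stddef.h>
#include <stdint.h>

enum ovs { OVS_NOTFOUND, OVS_VALID, OVS_INVALID };

/* Hypothetical flat representation of one roa-set entry (IPv4 only). */
struct roa {
	uint32_t	addr;	/* network address, host byte order */
	uint8_t		plen;	/* prefix length of the entry */
	uint8_t		maxlen;	/* longest announced prefix allowed */
	uint32_t	asnum;	/* source-as */
};

/* Returns 1 if the entry covers addr/plen, 0 otherwise. */
static int
roa_covers(const struct roa *r, uint32_t addr, uint8_t plen)
{
	uint32_t mask;

	mask = r->plen == 0 ? 0 : ~(uint32_t)0 << (32 - r->plen);
	return plen >= r->plen && (addr & mask) == (r->addr & mask);
}

/* Classify addr/plen announced by origin against a set of n entries. */
static enum ovs
roa_check(const struct roa *set, size_t n, uint32_t addr, uint8_t plen,
    uint32_t origin)
{
	enum ovs res = OVS_NOTFOUND;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!roa_covers(&set[i], addr, plen))
			continue;
		if (set[i].asnum == origin && plen <= set[i].maxlen)
			return OVS_VALID;
		res = OVS_INVALID;	/* covered, but no match so far */
	}
	return res;
}
```

With the two entries from the example config, 193.0.4.0/24 from AS 3333 would classify as valid, the same prefix as a /25 as invalid, and 10.0.0.0/8 as not-found.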


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This makes it possible
to detect duplicates (which are only reported) and is needed as a preparation
step for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK; there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
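
The bsearch-based membership test described in this entry can be sketched as follows. This is an illustration only: struct as_table, as_cmp() and as_table_match() are hypothetical names, not the ones used in bgpd, and the table is assumed to be sorted in ascending order before any lookup.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an as-set: a sorted array of AS numbers. */
struct as_table {
	const uint32_t	*as;	/* sorted ascending */
	size_t		 nelem;
};

/* Three-way comparison of two AS numbers for bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Returns 1 if asnum is in the table, 0 otherwise. */
static int
as_table_match(const struct as_table *t, uint32_t asnum)
{
	return bsearch(&asnum, t->as, t->nelem, sizeof(*t->as),
	    as_cmp) != NULL;
}
```

A lookup costs O(log n) comparisons, against a linear walk over a long list of AS numbers in a filter rule.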


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it is needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
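
A minimal sketch of the "pipe teardown" idea, assuming a single child connected through a socketpair(2): the parent closes its end instead of calling kill(2), the child's read(2) returns 0 (EOF), and the child exits on its own. pipe_teardown_demo() is a hypothetical helper written for illustration and is not taken from bgpd.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that blocks reading from its end of a socketpair, then
 * close the parent's end. The child sees EOF and exits with status 0.
 * Returns the child's exit status, or -1 on error.
 */
static int
pipe_teardown_demo(void)
{
	int sv[2], status;
	pid_t pid;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);		/* child keeps sv[1] only */
		/* read() returning 0 means the parent closed its end */
		_exit(read(sv[1], &c, 1) == 0 ? 0 : 1);
	}
	close(sv[1]);			/* parent keeps sv[0] only */
	close(sv[0]);			/* teardown: no kill(2) needed */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Because no signal is sent, the parent never has to trust a stored PID, which is what removes the PID reuse race named in the list.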


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
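
A small sketch of what replaces the hand-rolled block mode helper: the flags are requested atomically when the socket is created, so no follow-up fcntl(2) calls are needed to set them. open_nonblock_socket() and fd_is_nonblocking() are hypothetical helpers for illustration; accept4(2) takes the same SOCK_NONBLOCK and SOCK_CLOEXEC flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/* Create a UNIX stream socket that is non-blocking and close-on-exec. */
static int
open_nonblock_socket(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Returns 1 if O_NONBLOCK is set on fd, 0 otherwise. */
static int
fd_is_nonblocking(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	return fl != -1 && (fl & O_NONBLOCK) != 0;
}
```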


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
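
A sketch of the pattern this entry describes, under the assumption of a single descriptor: zero the pollfd array before every poll(2) call so no stale revents survive, and do not inspect the array when poll() reports an error or a timeout. poll_readable() is a hypothetical helper, not a function from bgpd.

```c
#include <poll.h>
#include <string.h>

/*
 * Poll a single descriptor for readability. Returns the revents seen,
 * or 0 if poll() failed or timed out.
 */
static short
poll_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int nfds;

	/* Start from a clean array so old revents cannot be reused. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds <= 0)
		return 0;	/* error (e.g. EINTR) or timeout: skip pfd */
	return pfd[0].revents;
}
```

Passing -1 as timeout_ms makes poll() wait indefinitely.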


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
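
The ordering fix can be sketched like this. The names reconfig_pending, do_reload() and check_signals() are hypothetical and only illustrate the pattern. If the flag were cleared after the work, a signal arriving while the work runs would be wiped out and lost; clearing it first means such a signal sets the flag again and is handled on the next pass.

```c
#include <signal.h>

static volatile sig_atomic_t	reconfig_pending;	/* set by the handler */
static int			reloads_done;		/* counts work performed */

/* Signal handler: only records that work is pending. */
static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Placeholder for the real work, e.g. rereading the config file. */
static void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop: clear the flag first, then do the work. */
static void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```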


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
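
File descriptor passing over a UNIX socket is done with SCM_RIGHTS control messages. The sketch here shows the bare mechanism with plain sendmsg(2) and recvmsg(2); it is not the imsg/msgbuf code from this commit, and send_fd() and recv_fd() are hypothetical helper names. In the scenario from this entry, the privileged parent would call socket() and bind() and then hand the resulting descriptor to the session engine this way.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the UNIX socket sock. Returns 0 on success, -1 on error. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	union fd_cmsgbuf cmsgbuf;
	char dummy = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &dummy;		/* at least one data byte is sent */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd, or -1 on error. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	union fd_cmsgbuf cmsgbuf;
	char dummy;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

The receiver gets its own descriptor number referring to the same open file, so the sender may close its copy after a successful send.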


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
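
A minimal sketch of the check this entry describes: SIGCHLD only means a child changed state, so the parent calls waitpid(2) and inspects the status before concluding that the child is gone. child_terminated() and spawn_and_reap() are hypothetical helpers written for illustration; the second one forks a throwaway child so the example is self-contained.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if status says the child exited or was killed by a signal. */
static int
child_terminated(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Fork a child that exits with `code`, then reap it and check the status.
 * Returns the child's exit code, or -1 on failure or abnormal termination.
 */
static int
spawn_and_reap(int code)
{
	pid_t pid, w;	/* waitpid() returns a pid_t */
	int status;

	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0)
		_exit(code);

	w = waitpid(pid, &status, 0);
	if (w != pid || !child_terminated(status))
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```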


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
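
The split between a read step that fills the buffer and a get step that only parses it can be illustrated with a toy framing. This is not the imsg API: struct rbuf, rbuf_fill(), rbuf_get() and rbuf_drain() are hypothetical, and the wire format (one length byte followed by the payload) is an assumption made purely for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy receive buffer holding raw bytes as they came off the socket. */
struct rbuf {
	uint8_t	data[256];
	size_t	len;		/* bytes currently buffered */
};

/* Stand-in for the read step: append raw bytes to the buffer. */
static int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return -1;
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return 0;
}

/*
 * Stand-in for the get step: extract one complete message from the
 * buffer without reading from any descriptor. out must hold 255 bytes.
 * Returns the payload length, or -1 if no complete message is buffered.
 */
static int
rbuf_get(struct rbuf *rb, uint8_t *out)
{
	size_t mlen;

	if (rb->len < 1)
		return -1;
	mlen = rb->data[0];
	if (rb->len < 1 + mlen)
		return -1;	/* message is still incomplete */
	memcpy(out, rb->data + 1, mlen);
	memmove(rb->data, rb->data + 1 + mlen, rb->len - 1 - mlen);
	rb->len -= 1 + mlen;
	return (int)mlen;
}

/* After one fill, loop until every complete message is consumed. */
static int
rbuf_drain(struct rbuf *rb)
{
	uint8_t msg[255];
	int n = 0;

	while (rbuf_get(rb, msg) != -1)
		n++;		/* a real caller would dispatch msg here */
	return n;
}
```

Calling the get step only once per poll event would leave the second message sitting in the buffer until more data arrives, which is the queue build-up this entry describes.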


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...
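
A rough sketch of the split described above, assuming a log prefix of "fatal: " and using hypothetical helper names (`fatal_text()`, `fatalx_text()`, `socket_error_text()`). The real fatal() and fatalx() log and terminate the process; these helpers only build the text so the difference is visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Text a fatal()-style call would log: message plus strerror(errno). */
int
fatal_text(char *buf, size_t len, const char *emsg)
{
	assert(buf != NULL && emsg != NULL);
	return (snprintf(buf, len, "fatal: %s: %s", emsg, strerror(errno)));
}

/* Text a fatalx()-style call would log: errno is never consulted. */
int
fatalx_text(char *buf, size_t len, const char *emsg)
{
	assert(buf != NULL && emsg != NULL);
	return (snprintf(buf, len, "fatal: %s", emsg));
}

/*
 * The one odd caller: the error number comes from somewhere other
 * than errno (for example a socket's SO_ERROR), so it is stored in
 * errno first and the ordinary function is used.
 */
int
socket_error_text(char *buf, size_t len, int so_error)
{
	errno = so_error;
	return (fatal_text(buf, len, "connect"));
}
```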


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
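
Why the return type matters can be shown with a small sketch (the `io_classify*` names are made up for illustration): once a -1 return is stored in a size_t it wraps to a very large positive value and the error can no longer be told apart from data.

```c
#include <assert.h>
#include <sys/types.h>

enum io_result { IO_ERROR, IO_EOF, IO_DATA };

/* Return value kept in the signed type that read(2)/write(2) use. */
enum io_result
io_classify(ssize_t n)
{
	assert(n >= -1);
	if (n == -1)
		return (IO_ERROR);
	if (n == 0)
		return (IO_EOF);
	return (IO_DATA);
}

/* The same value squeezed into a size_t first: -1 wraps around. */
enum io_result
io_classify_unsigned(ssize_t ret)
{
	size_t n = (size_t)ret;

	if (n == 0)
		return (IO_EOF);
	return (IO_DATA);	/* an error now looks like data */
}
```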


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.245 09-Jun-2022 claudio

Remove the rdomain / rtableid passed to some kroute functions.

kr_nexthop_add() and kr_nexthop_delete() only operate on the main table
so just pass in the right rdomain id.
kr_shutdown() and kr_dispatch_msg() don't really need the rdomain passed.
This was done for kif_remove(), since that function needs to remove connected
routes from the rdomain table. Connected routes can only exist in the
interface's rdomain so just use kif->k.rdomain. If such routes exist that
table exists as well. If the table does not exist there are also no
connected routes to track.
OK tb@


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
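
The pitfall can be sketched with stand-in functions (`fake_addr2sa()` and `fake_connect()` are hypothetical and only mimic the shape of addr2sa() and connect(2)). In C the order in which function arguments are evaluated is unspecified, so the length may be read before the helper has written it.

```c
#include <assert.h>
#include <stddef.h>

static const char fake_sa[16];

/* Hypothetical stand-in for addr2sa(): returns an address, sets *len. */
const void *
fake_addr2sa(size_t *len)
{
	*len = sizeof(fake_sa);
	return (fake_sa);
}

/* Hypothetical stand-in for connect(2)/bind(2): reports the length. */
size_t
fake_connect(const void *sa, size_t len)
{
	(void)sa;
	return (len);
}

size_t
connect_safely(void)
{
	const void *sa;
	size_t len = 0;

	/*
	 * Not this:  fake_connect(fake_addr2sa(&len), len);
	 * The second argument may be evaluated before the first one
	 * has updated len, so 0 could be passed instead of 16.
	 */
	sa = fake_addr2sa(&len);
	assert(sa != NULL);
	return (fake_connect(sa, len));
}
```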


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
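
A minimal sketch of the idea, assuming a read size of 16384 bytes (`READ_SIZE` and `make_ipc_pair()` are names chosen for this example, not taken from bgpd). Kernels may round or cap the requested buffer size, so the sketch only checks that the calls succeed.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE 16384	/* assumed read size for this sketch */

/* Create a socketpair whose buffers are 4 times the read size. */
int
make_ipc_pair(int sv[2])
{
	int i, bsize = 4 * READ_SIZE;

	assert(sv != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```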


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
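
The idiom in a nutshell, with made-up names (`struct peer_conf`, `msg_put()` and `send_peer()` are hypothetical; `msg_put()` merely stands in for an imsg send call):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct peer_conf {
	uint32_t id;
	char	 descr[32];
};

/* Hypothetical stand-in for an imsg send call: copies len bytes. */
size_t
msg_put(void *dst, size_t dstlen, const void *data, size_t len)
{
	assert(dst != NULL && data != NULL);
	if (len > dstlen)
		return (0);
	memcpy(dst, data, len);
	return (len);
}

size_t
send_peer(void *dst, size_t dstlen, const struct peer_conf *p)
{
	/* sizeof(*p) keeps following p if its type is ever changed. */
	return (msg_put(dst, dstlen, p, sizeof(*p)));
}
```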


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling, moving the pfkey socket to the parent
process along the way. The keys are refreshed whenever the session state
changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
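
A hedged sketch of one way to handle the equal-descriptor case (`fd_for_exec()` is a name invented here): when both numbers match, dup2() does nothing and the close-on-exec flag stays set, so the flag is cleared explicitly instead.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd for a program about to be
 * exec'ed.  Returns 0 on success, -1 on error.
 */
int
fd_for_exec(int oldd, int newd)
{
	int flags;

	assert(oldd >= 0 && newd >= 0);
	if (oldd == newd) {
		/* dup2() would be a no-op and leave FD_CLOEXEC set. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* The duplicate made by dup2() does not carry FD_CLOEXEC. */
	return (dup2(oldd, newd) == -1 ? -1 : 0);
}
```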


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
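
The lookup idea can be sketched with the standard qsort(3) and bsearch(3) calls. The function names and the flat uint32_t array are assumptions for this example; the actual set representation in bgpd is not shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once after the set has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL);
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if asnum is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t asnum)
{
	assert(set != NULL);
	return (bsearch(&asnum, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup then costs a logarithmic number of comparisons instead of a walk over the whole list.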


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
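
The mechanism relied on here is that a read on a socket whose peer has been closed returns 0. A small sketch with hypothetical helper names (`ipc_open()`, `ipc_peer_gone()`), kept in one process for brevity:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create the IPC channel; sv[0] is the parent end, sv[1] the child end. */
int
ipc_open(int sv[2])
{
	assert(sv != NULL);
	return (socketpair(AF_UNIX, SOCK_STREAM, 0, sv));
}

/*
 * Child-side check: returns 1 if the peer closed its end (EOF),
 * 0 if data was read, -1 on error.
 */
int
ipc_peer_gone(int fd)
{
	char buf[64];
	ssize_t n;

	assert(fd >= 0);
	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return (-1);
	return (n == 0);
}
```

A child that sees the EOF leaves its event loop and exits on its own, so the parent needs neither kill(2) nor the child's PID.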


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
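
A minimal sketch of the pattern this commit describes, not the actual bgpd
code. poll_round() is a hypothetical helper: the caller owns the pollfd slot
and reuses it on every loop iteration, so the slot is zeroed before each
poll() call and is not inspected when poll() fails.

```c
#include <poll.h>
#include <string.h>

/*
 * One iteration of a poll loop over a single descriptor.
 * Returns 1 if fd is readable, 0 if it is not (timeout or some
 * other event), and -1 if poll() itself failed.
 */
int
poll_round(struct pollfd *pfd, int fd, int timeout_ms)
{
	int n;

	/* Wipe the slot so no revents from an earlier round survive. */
	memset(pfd, 0, sizeof(*pfd));
	pfd->fd = fd;
	pfd->events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n == -1)
		return -1;	/* e.g. EINTR: do not look at pfd at all */
	if (n == 0)
		return 0;
	return (pfd->revents & POLLIN) ? 1 : 0;
}
```

With the memset in place, a failed poll() leaves revents at zero, so even
code that did look at the slot would not act on a stale event.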


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
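
The ordering can be sketched as follows. This is an illustration only; the
names reconfig_flag, sighup_handler(), do_reload() and check_reload() are
made up for the example and are not bgpd identifiers.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_flag;	/* set by the signal handler */
int reload_count;			/* stands in for the real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_flag = 1;
}

void
do_reload(void)
{
	reload_count++;
}

/* Called from the main loop.  Returns 1 if a reload was run. */
int
check_reload(void)
{
	if (reconfig_flag) {
		/*
		 * Clear first, then work.  A signal that arrives while
		 * do_reload() runs sets the flag again and is picked up
		 * on the next pass.  With the opposite order that signal
		 * would be wiped out by the late "flag = 0".
		 */
		reconfig_flag = 0;
		do_reload();
		return 1;
	}
	return 0;
}
```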


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
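
A minimal sketch of the check described above, assuming a parent that knows
the pid of its child. child_is_gone() and check_child() are hypothetical
names for this example, not functions from bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/* 1 if the status says the child exited or was killed, else 0. */
int
child_is_gone(int status)
{
	return (WIFEXITED(status) || WIFSIGNALED(status)) ? 1 : 0;
}

/*
 * Called after SIGCHLD was noticed.  Returns 1 if child `pid` has
 * terminated, 0 if it is still alive (SIGCHLD is also sent when a
 * child is merely stopped), and -1 if waitpid() failed.
 */
int
check_child(pid_t pid)
{
	pid_t r;
	int status;

	do {
		r = waitpid(pid, &status, WNOHANG);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* nothing to reap, child still running */
	return child_is_gone(status);
}
```

Only a return value of 1 would justify the "log & quit" path; a 0 means the
signal was about something other than termination.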


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.244 05-Jun-2022 claudio

Rework how fib_priority is handled.
Instead of passing it around all the time put the fib_priority into the
kroute state. It is only needed in send_rtmsg() in the end.
Additionally insert F_BGP_INSERTED routes with a special RTP_MINE priority.
This makes changing the fib_priority at runtime simpler because there
is no need to alter the kroute table anymore.
OK tb@ deraadt@


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In anycase do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
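
The problem can be illustrated with a small sketch. fill_sa() is a
hypothetical stand-in for addr2sa() with an assumed signature, and the
loopback address and port 179 are example values only. The order in which C
evaluates function arguments is unspecified, so in the inlined form len may
be read before the helper has stored the real length.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Fills *ss with an example IPv4 address and stores its size in *len. */
struct sockaddr *
fill_sa(struct sockaddr_storage *ss, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(179);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

int
connect_split(int fd)
{
	struct sockaddr_storage ss;
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Risky form: connect(fd, fill_sa(&ss, &len), len);
	 * len could still be 0 when it is copied as the third argument.
	 * Splitting the call forces fill_sa() to run first.
	 */
	sa = fill_sa(&ss, &len);
	return connect(fd, sa, len);
}
```

Whether the inlined form works depends on the compiler and ABI, which would
explain why it showed up on one platform and not another.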


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
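
A rough sketch of the idea, not the code from this revision. make_channel()
and its read_size parameter are assumptions made for the example; the real
read size used by bgpd is not stated in the log.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a socketpair and ask for send and receive buffers of
 * four times read_size on both ends.  Returns 0 or -1.
 */
int
make_channel(int sv[2], int read_size)
{
	int i, bufsize = 4 * read_size;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;

	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsize, sizeof(bufsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsize, sizeof(bufsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```

The kernel may round or cap the requested size, so the value reported by
getsockopt() afterwards is not guaranteed to equal 4 * read_size.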


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since both
the parent and the session engine now merge the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
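
A short sketch of the pitfall, under the assumption that the descriptor must survive exec(). The helper name move_fd() is hypothetical and the committed fix may look different.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd so that it survives exec().
 * dup2() clears FD_CLOEXEC on the copy, but it does nothing at all when
 * oldd == newd, so in that case the flag is cleared by hand.
 * Returns 0 on success, -1 on error.
 */
int
move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	return dup2(oldd, newd) == -1 ? -1 : 0;
}
```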


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain), the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
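
The lookup idea can be sketched as follows. This is an illustration only: the function names as_set_prepare() and as_set_match() are assumptions and not the names used in bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers for qsort/bsearch. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once after it has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if `as` is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Compared with walking a long list of AS numbers per filter rule, the binary search needs only a logarithmic number of comparisons per lookup.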


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
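
A self-contained sketch of the "pipe teardown" idea, assuming a trivial child loop. The names child_loop() and run_and_teardown() are hypothetical and the real daemon dispatches imsgs instead of discarding data.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child event loop: run until the parent closes its end (read returns 0). */
static void
child_loop(int fd)
{
	char buf[64];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* a real daemon would dispatch messages here */
	_exit(n == 0 ? 0 : 1);	/* 0: clean EOF, 1: read error */
}

/*
 * Start a child, then shut it down by closing the IPC socket instead of
 * calling kill(2). Returns the child's exit status, or -1 on error.
 */
int
run_and_teardown(void)
{
	int sv[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		child_loop(sv[1]);	/* does not return */
	}
	close(sv[1]);
	close(sv[0]);	/* the child sees EOF and exits on its own */
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Since no PID is ever passed to kill(2), there is no window in which a recycled PID could be signalled by mistake.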


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
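
A small sketch of why the signal is not needed, assuming an ordinary pipe as a stand-in for the IPC channel. The function name write_to_broken_pipe() is made up for this example.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE, then write to a pipe whose read end is closed.
 * Instead of the process being killed, write() fails with EPIPE and the
 * caller can handle the broken pipe like any other error.
 * Returns the errno from write(), 0 if the write unexpectedly
 * succeeded, or -1 if the setup failed.
 */
int
write_to_broken_pipe(void)
{
	int p[2], saved = 0;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	if (pipe(p) == -1)
		return -1;
	close(p[0]);			/* nobody will ever read */
	if (write(p[1], "x", 1) == -1)
		saved = errno;
	close(p[1]);
	return saved;
}
```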


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
	nexthop qualify via bgp
	nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
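
A minimal sketch of the fix, assuming a fixed set of two descriptors. The helper name wait_for_input() and PFD_COUNT are assumptions for illustration; memset() is used here in place of bzero().

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical number of descriptors */

/*
 * Rebuild the pollfd array from scratch on every iteration so that stale
 * revents from an earlier round cannot be mistaken for fresh events when
 * poll() fails (e.g. with EINTR) and leaves the array untouched.
 * Returns whatever poll() returns.
 */
int
wait_for_input(int fd_a, int fd_b, struct pollfd pfd[PFD_COUNT],
    int timeout_ms)
{
	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

The caller should still check the return value and skip the revents handling entirely on error or timeout.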


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, depending on which
child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
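
The corrected ordering can be sketched like this. The names reload_flag, reloads_done, do_reload() and check_reload() are placeholders for this example and not taken from bgpd.

```c
#include <signal.h>

volatile sig_atomic_t reload_flag;	/* set by the signal handler */
int reloads_done;			/* hypothetical work counter */

/* Signal handler: only record that the signal arrived. */
void
sighup_handler(int sig)
{
	(void)sig;
	reload_flag = 1;
}

/* Stand-in for the real work, e.g. rereading the config. */
void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop: clear the flag first, then do the work. */
void
check_reload(void)
{
	if (reload_flag) {
		reload_flag = 0;	/* a signal arriving from here on is kept */
		do_reload();
	}
}
```

If the flag were cleared after the work instead, a signal delivered while do_reload() runs would be wiped out and the request lost.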


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
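
The underlying mechanism is SCM_RIGHTS control messages on an AF_UNIX socket. The sketch below shows it in isolation; send_fd() and recv_fd() are hypothetical helpers and not the imsg/msgbuf API, which wraps this in its own framing.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the AF_UNIX socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;		/* one byte of payload is required */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == -1 ? -1 : 0;
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the setup the commit describes, the privileged parent would call socket() and bind() and then hand the resulting descriptor to the unprivileged session engine this way.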


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
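
The waitpid() check described in 1.56 can be sketched roughly as below. This is an illustration only, not the code from bgpd.c: the enum and the function name check_child() are hypothetical, and WNOHANG is an assumption made so the check never blocks.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>

/* Hypothetical result codes, not taken from bgpd. */
enum child_state { CHILD_RUNNING, CHILD_EXITED, CHILD_SIGNALED, CHILD_ERROR };

/*
 * Check one child without blocking. SIGCHLD only says that something
 * changed, so the status from waitpid() decides whether the child is
 * really gone.
 */
enum child_state
check_child(pid_t child)
{
	int	status;
	pid_t	pid;

	assert(child > 0);
	pid = waitpid(child, &status, WNOHANG);
	if (pid == -1)
		return (CHILD_ERROR);		/* e.g. ECHILD: not our child */
	if (pid == 0)
		return (CHILD_RUNNING);		/* no state change yet */
	if (WIFEXITED(status))
		return (CHILD_EXITED);
	if (WIFSIGNALED(status))
		return (CHILD_SIGNALED);
	return (CHILD_RUNNING);			/* other change, not a termination */
}
```

A parent would call this from its main loop after the signal handler has set a flag, and log and quit only on CHILD_EXITED or CHILD_SIGNALED.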


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
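
The split between a read step and a buffer-only get step in 1.48 can be sketched as below. This is a toy model under stated assumptions, not the imsg API: the framing (2-byte total length, 2-byte type), the 4096-byte buffer and all toy_* names are made up for illustration, and payload bytes are simply discarded.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUFSZ 4096

/* Hypothetical framing: 2-byte total length, 2-byte type, then payload. */
struct toy_hdr {
	uint16_t len;
	uint16_t type;
};

struct toy_buf {
	unsigned char	data[TOY_BUFSZ];
	size_t		used;
};

/* Stand-in for the read step: append raw bytes, as one read(2) might. */
int
toy_feed(struct toy_buf *b, const void *src, size_t n)
{
	assert(b != NULL && b->used <= sizeof(b->data));
	if (n > sizeof(b->data) - b->used)
		return (-1);
	memcpy(b->data + b->used, src, n);
	b->used += n;
	return (0);
}

/*
 * Stand-in for the get step: works only on the buffer, never reads.
 * Returns 1 and fills *hdr if a complete message was removed,
 * 0 if more data is needed, -1 on a malformed header.
 */
int
toy_get(struct toy_buf *b, struct toy_hdr *hdr)
{
	if (b->used < sizeof(*hdr))
		return (0);
	memcpy(hdr, b->data, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > sizeof(b->data))
		return (-1);
	if (b->used < hdr->len)
		return (0);
	memmove(b->data, b->data + hdr->len, b->used - hdr->len);
	b->used -= hdr->len;
	return (1);
}

/* Caller side: after one feed, loop until no full message is left. */
int
toy_drain(struct toy_buf *b)
{
	struct toy_hdr	hdr;
	int		n, count = 0;

	while ((n = toy_get(b, &hdr)) == 1)
		count++;	/* a real caller would dispatch on hdr.type */
	return (n == -1 ? -1 : count);
}
```

Calling the get step only once per poll event would leave the second and third message sitting in the buffer, which is the queue build-up the commit message describes.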


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.243 02-Jun-2022 claudio

Adjust some warning messages to be a bit more accurate. ktable_update()
actually loads a routing table and not really an rdomain.


Revision tags: OPENBSD_7_1_BASE
# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
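
The argument-order problem from 1.235 can be sketched as below. toy_addr2sa() is a hypothetical stand-in with an assumed signature, not bgpd's addr2sa(); it only shares the property that it writes the length through a pointer.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in: fills *ss and stores its size in *len. */
struct sockaddr *
toy_addr2sa(struct sockaddr_storage *ss, in_port_t port, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	assert(ss != NULL && len != NULL);
	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return ((struct sockaddr *)ss);
}

int
toy_connect(int fd, in_port_t port)
{
	struct sockaddr_storage	 ss;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Risky form: connect(fd, toy_addr2sa(&ss, port, &len), len);
	 * C does not fix whether "len" is read before or after the call
	 * that writes it, so connect() may be handed the stale 0.
	 */
	sa = toy_addr2sa(&ss, port, &len);	/* len is written here ... */
	return (connect(fd, sa, len));		/* ... and only read here */
}
```

Splitting the call into two statements puts a sequence point between the write and the read of len, so every compiler sees the same value.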


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
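
The buffer sizing from 1.220 might look roughly like the sketch below. TOY_READ_SIZE is an assumed value, not the constant bgpd uses, and toy_socketpair() is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <unistd.h>

#define TOY_READ_SIZE 65535	/* assumed read size, not bgpd's constant */

/*
 * Create a socketpair and ask for send and receive buffers of four
 * read sizes on both ends. The kernel may round or clamp the request.
 */
int
toy_socketpair(int sv[2])
{
	int	i, sz = 4 * TOY_READ_SIZE;

	assert(sv != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &sz, sizeof(sz)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &sz, sizeof(sz)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

The size actually granted can be read back with getsockopt() and may differ from the request depending on the platform's limits.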


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
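
The idiom preferred in 1.217 can be shown with a toy sender. struct toy_peer, toy_send() and send_peer() are hypothetical names; toy_send() merely copies bytes and stands in for whatever function takes a data pointer and a length.

```c
#include <assert.h>
#include <string.h>

struct toy_peer {
	unsigned int	id;
	char		descr[32];
};

/* Hypothetical sender: copies "len" bytes of "data" into "out". */
size_t
toy_send(void *out, const void *data, size_t len)
{
	assert(out != NULL && data != NULL);
	memcpy(out, data, len);
	return (len);
}

size_t
send_peer(void *out, const struct toy_peer *p)
{
	/*
	 * sizeof(*p) follows the declared type of p. Writing
	 * sizeof(struct toy_peer) would silently go stale if the type
	 * of p were changed later.
	 */
	return (toy_send(out, p, sizeof(*p)));
}
```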


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
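
The case handled in 1.214 can be sketched as a helper. dup_fd_for_exec() is a hypothetical name and the fcntl() branch is one plausible way to handle equal descriptors, not necessarily what the commit does.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make "oldd" available as "newd" in a way that
 * survives exec(). dup2() leaves close-on-exec cleared on the copy,
 * but when both numbers are equal it does nothing, so the flag has to
 * be cleared by hand in that case.
 */
int
dup_fd_for_exec(int oldd, int newd)
{
	int	flags;

	assert(oldd >= 0 && newd >= 0);
	if (oldd != newd)
		return (dup2(oldd, newd) == -1 ? -1 : 0);

	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return (-1);
	return (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0);
}
```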


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
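
The bsearch based lookup from 1.195 can be sketched as below. The struct layout and the toy_as_set_* names are hypothetical and only show the general technique: sort once, then do a binary search per lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical set type: an array of AS numbers, sorted before use. */
struct toy_as_set {
	uint32_t	*as;
	size_t		 nmemb;
};

static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return ((x > y) - (x < y));	/* avoids wraparound of x - y */
}

/* Sort once after loading; bsearch() requires sorted input. */
void
toy_as_set_prep(struct toy_as_set *set)
{
	assert(set != NULL && set->as != NULL);
	qsort(set->as, set->nmemb, sizeof(set->as[0]), as_cmp);
}

/* Returns 1 if "as" is in the set, 0 otherwise; O(log n) per lookup. */
int
toy_as_set_match(const struct toy_as_set *set, uint32_t as)
{
	return (bsearch(&as, set->as, set->nmemb, sizeof(set->as[0]),
	    as_cmp) != NULL);
}
```

Compared with walking a long list of AS numbers for every filter evaluation, the cost per lookup grows only with the logarithm of the set size.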


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
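
The "pipe teardown" from 1.187 can be sketched with two small functions. child_loop() and parent_shutdown() are hypothetical names; a real child would run a poll based event loop rather than a blocking read.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Child side: keep reading until the peer closes its end.
 * Returns 0 on EOF (parent is gone or shutting down), -1 on a read error.
 */
int
child_loop(int fd)
{
	char	buf[512];
	ssize_t	n;

	assert(fd >= 0);
	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == 0)
			return (0);	/* EOF: leave the loop and exit */
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		/* n > 0: a real daemon would process the message here */
	}
}

/* Parent side: no kill(2) needed, closing the socket is the signal. */
void
parent_shutdown(int fd)
{
	close(fd);
}
```

Because the parent never needs the child's PID for this, there is no window in which a recycled PID could receive a stray signal.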


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call dowork(). ok henning@
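
The ordering described above can be sketched as follows. This is a minimal
illustration, not code from bgpd: reconfig_pending, reloads_done, do_reload()
and check_signals() are hypothetical names chosen for the example.

```c
#include <signal.h>

/* Set asynchronously by the signal handler, consumed by the main loop. */
volatile sig_atomic_t reconfig_pending;
int reloads_done;	/* hypothetical counter standing in for real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Hypothetical work function; a real daemon would reread its config. */
void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_reload() is running sets the flag again and is picked up on the
 * next pass, instead of being wiped out by a late "flag = 0".
 */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```

With the flag cleared after the work instead, a SIGHUP delivered during
do_reload() would be lost without any reload being run for it.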


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
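
Descriptor passing of this kind is usually built on SCM_RIGHTS control
messages over a UNIX-domain socket. The sketch below shows the general
technique only; send_fd() and recv_fd() are hypothetical helpers and are not
the imsg/msgbuf functions the commit refers to.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the UNIX-domain socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;		/* carry one byte of ordinary data */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the setup the commit describes, the privileged process would call
something like send_fd() after socket() and bind(), and the unprivileged
process would take the descriptor over with the receiving side.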


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
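
A minimal sketch of the check described above, assuming a parent that reaps
children from its main loop. child_is_gone() and reap_children() are
hypothetical helper names, not functions from bgpd.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Decide whether a status from waitpid() means the child is gone.
 * Returns 1 if it exited or was killed by a signal, 0 otherwise
 * (for example when it was merely stopped).
 */
int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Reap every child that changed state, without blocking.
 * Returns the number of children that really terminated.
 */
int
reap_children(void)
{
	pid_t pid;
	int status, gone = 0;

	while ((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
		if (child_is_gone(status))
			gone++;
	}
	return gone;
}
```

A SIGCHLD handler would only set a flag; the main loop would then call
something like reap_children() and log and quit only when the count is
non-zero.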


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
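
The split between "read into the buffer" and "take one message out of the
buffer" can be illustrated with a simplified framing scheme. Everything here
is an assumption made for the example: struct rbuf, rbuf_fill() and rbuf_get()
are hypothetical and do not reproduce the real imsg structures or wire format.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define RBUF_SIZE 4096

/* Hypothetical read buffer; not the real struct imsgbuf. */
struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* Append freshly read bytes (the role imsg_read plays in the commit). */
int
rbuf_fill(struct rbuf *rb, const void *data, size_t n)
{
	if (n > sizeof(rb->buf) - rb->len)
		return -1;
	memcpy(rb->buf + rb->len, data, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one message (the role of imsg_get). A message is a 2-byte
 * total length in host byte order, header included, then the payload.
 * Returns the payload length, or -1 if no complete, valid message
 * that fits into "payload" is buffered.
 */
ssize_t
rbuf_get(struct rbuf *rb, void *payload, size_t max)
{
	uint16_t total;

	if (rb->len < sizeof(total))
		return -1;
	memcpy(&total, rb->buf, sizeof(total));
	if (total < sizeof(total) || total > rb->len ||
	    total - sizeof(total) > max)
		return -1;
	memcpy(payload, rb->buf + sizeof(total), total - sizeof(total));
	/* Shift the remaining bytes to the front for the next call. */
	memmove(rb->buf, rb->buf + total, rb->len - total);
	rb->len -= total;
	return total - sizeof(total);
}
```

A caller would fill the buffer once per poll event and then call rbuf_get()
in a loop until it returns -1, so that several messages delivered by a single
read are all processed right away.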


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
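
Why the signed type matters can be shown with a small wrapper. This is only
an illustration; read_some() is a hypothetical helper and not part of bgpd.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes, retrying when interrupted by a signal.
 * Returns the byte count, 0 on end of file, -1 on error.
 *
 * The result must be kept in a ssize_t. Stored in a size_t, the -1
 * error value would turn into a huge positive count and a test such
 * as "n < 0" could never be true.
 */
ssize_t
read_some(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n == -1 && errno == EINTR);

	return n;
}
```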


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.242 06-Feb-2022 claudio

Switch from u_intX_t types to stdint.h uintX_t. Mostly mechanical with
a few reindents.
OK florian@ tb@


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@
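
One way to structure such a loop body is sketched below. It is an assumption
about the general shape, not the actual bgpd main loop; poll_once(),
count_ready() and ready_seen are hypothetical names.

```c
#include <errno.h>
#include <poll.h>

int ready_seen;	/* hypothetical counter, for illustration only */

/* Hypothetical handler for a descriptor that has events pending. */
void
count_ready(struct pollfd *pfd)
{
	(void)pfd;
	ready_seen++;
}

/*
 * One pass of a poll loop. Returns the number of descriptors handled,
 * 0 on timeout or EINTR, and -1 on any other poll() error.
 */
int
poll_once(struct pollfd *pfd, nfds_t npfd, int timeout,
    void (*handle_fd)(struct pollfd *))
{
	nfds_t i;
	int nready, handled = 0;

	/* Defensive: make sure no stale revents survive a failed poll(). */
	for (i = 0; i < npfd; i++)
		pfd[i].revents = 0;

	nready = poll(pfd, npfd, timeout);
	if (nready == -1) {
		if (errno != EINTR)
			return -1;
		nready = 0;	/* interrupted: skip the pollfd handling */
	}

	for (i = 0; i < npfd && nready > 0; i++) {
		if (pfd[i].revents == 0)
			continue;
		handle_fd(&pfd[i]);
		handled++;
		nready--;
	}

	/* Signal flags would be checked here, also after EINTR. */
	return handled;
}
```

Because the EINTR case falls through to the end of the function rather than
returning early, any pending reconfigure or mrt flag would still be acted on
in the same pass.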


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@
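
The usual pattern for a connect() that cannot stall the caller looks roughly
like this. It is a generic sketch and not the code from this commit;
connect_nonblock() and connect_result() are hypothetical helpers.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Start a connect() that never blocks the caller.
 * Returns 0 if connected at once, 1 if the connect is in progress
 * (wait for POLLOUT, then call connect_result()), and -1 on failure.
 */
int
connect_nonblock(int fd, const struct sockaddr *sa, socklen_t len)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1 ||
	    fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;
	if (connect(fd, sa, len) == 0)
		return 0;
	return errno == EINPROGRESS ? 1 : -1;
}

/*
 * Once the socket is writable: 0 if the connection is up,
 * otherwise -1 with errno set to the pending socket error.
 */
int
connect_result(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return -1;
	if (err != 0) {
		errno = err;
		return -1;
	}
	return 0;
}
```

The process keeps serving its other descriptors while the connect is
pending, and an unreachable address only shows up later as an error from
connect_result().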


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
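
The problem and the fix can be sketched with a stand-in for addr2sa(). The
real function has a different signature; fill_sa() and bind_addr() below are
hypothetical and only reproduce the "size returned through a pointer" aspect.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills a static sockaddr for an
 * IPv4 address and reports its size through *len.
 */
struct sockaddr *
fill_sa(in_addr_t addr, in_port_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = addr;
	sin.sin_port = htons(port);
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

int
bind_addr(int fd, in_addr_t addr, in_port_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Not: bind(fd, fill_sa(addr, port, &len), len);
	 * The order in which function arguments are evaluated is
	 * unspecified, so len may be read before fill_sa() has
	 * stored the real size.
	 */
	sa = fill_sa(addr, port, &len);
	return bind(fd, sa, len);
}
```

Splitting the call into two statements forces the size to be written before
it is read, whatever order the compiler picks for argument evaluation.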


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
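
A minimal sketch of the idea in this commit, not the code from bgpd.c: create a
socketpair() and enlarge the send and receive buffers on both ends. The
function name make_socketpair() and the bufsize parameter are assumptions made
for illustration; the commit only says the buffers become 4 times the read
size.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an AF_UNIX stream socketpair and request
 * bufsize bytes for SO_RCVBUF and SO_SNDBUF on both ends.
 * Returns 0 on success, -1 on error (both fds are closed on error).
 */
int
make_socketpair(int pair[2], int bufsize)
{
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;

	for (i = 0; i < 2; i++) {
		if (setsockopt(pair[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsize, sizeof(bufsize)) == -1 ||
		    setsockopt(pair[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsize, sizeof(bufsize)) == -1) {
			close(pair[0]);
			close(pair[1]);
			return -1;
		}
	}
	return 0;
}
```

With larger buffers a single read or write after poll() can move more data,
which is the efficiency gain the message describes.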


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
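
The pitfall can be sketched as follows. This is an illustration under
assumptions, not the committed code: place_fd() is a hypothetical name, and
clearing FD_CLOEXEC with fcntl() is one plausible way to handle the
oldd == newd case.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure newd refers to the same file as oldd
 * and survives exec(). Returns newd on success, -1 on error.
 */
int
place_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/*
		 * dup2() does nothing when both descriptors are equal,
		 * so a close-on-exec flag would stay set. Clear it by hand.
		 */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return newd;
	}
	/* A descriptor created by dup2() does not have FD_CLOEXEC set. */
	return dup2(oldd, newd);
}
```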


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
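
The three ovs states used in the filter rules above can be sketched for IPv4
as follows. This is a simplified linear scan following the usual origin
validation rules (RFC 6811), not the trie-based lookup bgpd uses; struct roa
here, enum ovs and roa_validate() are hypothetical names for illustration. An
entry without an explicit maxlen is assumed to have maxlen equal to its prefix
length.

```c
#include <stddef.h>
#include <stdint.h>

enum ovs { OVS_NOTFOUND, OVS_VALID, OVS_INVALID };

/* Hypothetical, simplified ROA entry (IPv4 only, host byte order). */
struct roa {
	uint32_t prefix;
	uint8_t  plen;
	uint8_t  maxlen;
	uint32_t asnum;
};

/* Does the ROA prefix cover the announced prefix addr/plen? */
static int
roa_covers(const struct roa *r, uint32_t addr, uint8_t plen)
{
	uint32_t mask = r->plen == 0 ? 0 : 0xffffffffu << (32 - r->plen);

	return plen >= r->plen && (addr & mask) == (r->prefix & mask);
}

enum ovs
roa_validate(const struct roa *set, size_t n, uint32_t addr, uint8_t plen,
    uint32_t origin)
{
	enum ovs state = OVS_NOTFOUND;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!roa_covers(&set[i], addr, plen))
			continue;
		/* Covered and matching: prefix length and origin AS fit. */
		if (plen <= set[i].maxlen && origin == set[i].asnum)
			return OVS_VALID;
		/* Covered but no match so far. */
		state = OVS_INVALID;
	}
	return state;
}
```

With the roa-set from the message, 193.0.4.0/24 from AS 3333 would be valid,
193.0.4.0/25 from AS 3333 invalid (longer than maxlen 24), and a prefix that
no entry covers not-found.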


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set a fast lookup table to be used instead of long list of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
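
A minimal sketch of a bsearch-based AS lookup, assuming the set is kept as a
sorted array of 32-bit AS numbers. The names as_set_prepare() and
as_set_match() are hypothetical and are not taken from bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once so that lookups can use binary search. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup then costs O(log n) comparisons instead of a walk over a long
list of AS numbers in the filter rule.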


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updateing.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
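
A minimal sketch of the pattern from this commit and revision 1.128 above:
zero the pollfd array before every poll() call and do not look at revents
when poll() reports an error or a timeout. wait_readable() is a hypothetical
helper, not a function from bgpd.

```c
#include <poll.h>
#include <string.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout or no POLLIN, -1 on poll() error.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int n;

	/* Start from a clean array so no stale revents can be seen. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n <= 0)
		return n;	/* error (e.g. EINTR) or timeout: skip revents */

	return (pfd[0].revents & POLLIN) ? 1 : 0;
}
```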


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
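
The ordering fix can be sketched as below. The names reconfig_pending,
do_reload() and handle_pending() are hypothetical; the point is only the order
of the two statements inside the if block. If the flag were cleared after the
work, a signal arriving while the work runs would be wiped out and lost.

```c
#include <signal.h>

/* Set from the signal handler, consumed by the main loop. */
volatile sig_atomic_t reconfig_pending;
int reloads;	/* counts how often the work function ran */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reloads++;
}

void
handle_pending(void)
{
	if (reconfig_pending) {
		/*
		 * Clear the flag first. A signal that arrives during
		 * do_reload() sets it again and is handled next time.
		 */
		reconfig_pending = 0;
		do_reload();
	}
}
```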


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
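
The waitpid()-based check described above can be sketched roughly as follows. This is an illustration only, not the code from this revision: the helper names child_is_gone() and spawn_exiting_child() are hypothetical, and a real daemon would log through its own logging functions rather than stderr.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the child `pid` has exited or was
 * killed by a signal, 0 if there is no such state change to report,
 * and -1 on waitpid() error.
 */
int
child_is_gone(pid_t pid)
{
	int	status;
	pid_t	r;	/* waitpid() returns a pid_t, not an int */

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* SIGCHLD alone does not mean the child died */
	if (WIFEXITED(status)) {
		fprintf(stderr, "child %ld exited with status %d\n",
		    (long)r, WEXITSTATUS(status));
		return (1);
	}
	if (WIFSIGNALED(status)) {
		fprintf(stderr, "child %ld terminated by signal %d\n",
		    (long)r, WTERMSIG(status));
		return (1);
	}
	return (0);	/* stopped or continued, not dead */
}

/* Demo helper: fork a child that exits immediately with `code`. */
pid_t
spawn_exiting_child(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return (pid);
}
```

A SIGCHLD handler would typically only set a flag; the main loop then calls something like child_is_gone() and decides whether to log and quit.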


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
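
The read-once, drain-in-a-loop pattern described here can be sketched without the real imsg API. Everything below is hypothetical: the rbuf_* names, the buffer sizes and the 2-byte length header are assumptions made for the example and do not reflect the actual imsg wire format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN	2	/* assumed header: 16-bit total length, host order */

struct rbuf {
	unsigned char	data[4096];
	size_t		len;	/* bytes currently buffered */
};

/* Append the bytes one read(2) call delivered. Returns -1 if full. */
int
rbuf_fill(struct rbuf *rb, const void *p, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, p, n);
	rb->len += n;
	return (0);
}

/*
 * Extract one complete message. Returns 1 if a payload was copied to
 * out, 0 if no complete message is buffered yet, -1 on a bad header.
 * Never reads from the socket itself.
 */
int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz, size_t *outlen)
{
	uint16_t total;

	if (rb->len < HDR_LEN)
		return (0);
	memcpy(&total, rb->data, sizeof(total));
	if (total < HDR_LEN || total > sizeof(rb->data) ||
	    (size_t)total - HDR_LEN > outsz)
		return (-1);
	if (rb->len < total)
		return (0);
	*outlen = total - HDR_LEN;
	memcpy(out, rb->data + HDR_LEN, *outlen);
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (1);
}

/* Call after every fill: handle all buffered messages, not just one. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char	msg[256];
	size_t		msglen;
	int		n, count = 0;

	while ((n = rbuf_get(rb, msg, sizeof(msg), &msglen)) == 1)
		count++;	/* a real caller would dispatch msg here */
	return (n == -1 ? -1 : count);
}
```

Calling the get function only once per poll event would leave the second message sitting in the buffer until more data arrived, which is the queue build-up the commit message describes.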


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.241 23-Jan-2022 claudio

On poll() failure we want to skip pollfd related action but the signal
delivery checks at the end still need to happen. So that on EINTR bgpd
processes reconfigure or mrt files ASAP.
Fix for mrt integration tests.
Reported by and ok anton@


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@
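
A single iteration of a poll loop with this EINTR handling might look like the sketch below. It is an assumption-laden illustration, not bgpd's loop: loop_once(), the reconf_pending flag and the reconf_done counter are hypothetical names, and "handling" a descriptor is reduced to counting it.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/*
 * Returns the number of ready descriptors handled, 0 if poll() timed
 * out or was interrupted, -1 on any other poll() error.
 * *reconf_pending stands in for a flag set by a signal handler.
 */
int
loop_once(struct pollfd *pfd, nfds_t n, int timeout,
    volatile sig_atomic_t *reconf_pending, int *reconf_done)
{
	int	nready, handled = 0;
	nfds_t	i;

	nready = poll(pfd, n, timeout);
	if (nready == -1) {
		if (errno != EINTR)
			return (-1);	/* fail hard on real errors */
		nready = 0;	/* never inspect pfd[] after an error */
	}
	for (i = 0; nready > 0 && i < n; i++) {
		if (pfd[i].revents) {
			handled++;	/* a real loop would dispatch here */
			nready--;
		}
	}
	/* Signal-driven work still runs when poll() returned EINTR. */
	if (*reconf_pending) {
		*reconf_pending = 0;
		(*reconf_done)++;
	}
	return (handled);
}
```

The point of the structure is that the signal check sits after the pollfd handling rather than inside the success branch, so an interrupted poll() does not delay a pending reconfigure.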


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
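
The evaluation-order problem can be illustrated with a stand-in helper. make_sa() and connect_to() below are hypothetical and only mimic the shape of an addr2sa()-style function that stores the address length through a pointer; they are not bgpd code.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical stand-in for an addr2sa()-style helper: fills a static
 * sockaddr and reports its size through *len.
 */
struct sockaddr *
make_sa(const char *ip, unsigned short port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return (NULL);
	*len = sizeof(sin);
	return ((struct sockaddr *)&sin);
}

int
connect_to(int fd, const char *ip, unsigned short port)
{
	struct sockaddr	*sa;
	socklen_t	 len = 0;

	/*
	 * Fragile form: connect(fd, make_sa(ip, port, &len), len);
	 * C leaves the order of argument evaluation unspecified, so
	 * `len` may be read before make_sa() has stored into it.
	 */
	if ((sa = make_sa(ip, port, &len)) == NULL)
		return (-1);
	return (connect(fd, sa, len));	/* len is known to be set here */
}
```

Splitting the call into two statements costs one local variable and removes the dependence on how a particular compiler orders argument evaluation.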


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
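
Enlarging the socketpair buffers could be done along these lines. This is a sketch under assumptions: make_pipe() is a hypothetical name and READ_SIZE is an illustrative value, not the constant bgpd uses.

```c
#include <sys/types.h>
#include <sys/socket.h>

#define READ_SIZE	65535	/* assumed read size, for illustration only */

/* Create a socketpair whose buffers are sized at 4x the read size. */
int
make_pipe(int sv[2])
{
	int	bsize = 4 * READ_SIZE;
	int	i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		/* Best effort: the kernel may clamp or reject the value. */
		(void)setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize));
		(void)setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize));
	}
	return (0);
}
```

Ignoring setsockopt() failures is a choice made for the sketch; a daemon may prefer to retry with a smaller size or log a warning.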


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
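
One way to handle the oldd == newd case is to clear the close-on-exec flag explicitly instead of calling dup2(). The helper name fd_move_for_exec() is hypothetical and the commit message does not say that this is exactly what the revision does.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make `fd` available as descriptor `target` across exec().
 * Returns 0 on success, -1 on error.
 */
int
fd_move_for_exec(int fd, int target)
{
	int flags;

	if (fd != target) {
		/* dup2() creates `target` with close-on-exec cleared. */
		if (dup2(fd, target) == -1)
			return (-1);
		return (0);
	}
	/* dup2(fd, fd) does nothing, so FD_CLOEXEC would stay set. */
	if ((flags = fcntl(fd, F_GETFD)) == -1)
		return (-1);
	if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
		return (-1);
	return (0);
}
```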


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
    165.254.255.0/24 source-as 15562
    193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
    1.2.3.0/24 source-as 1,
    1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
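
A bsearch-backed membership test of the kind described can be sketched as below. The function names are hypothetical and the flat uint32_t array is an assumption; the real as-set data structure is not shown in this log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once after the set has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if `as` is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup is O(log n) in the number of AS numbers, compared with a linear walk over a long list of filter terms.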


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
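
The "pipe teardown" idea can be demonstrated in a few lines. This is a minimal sketch, not bgpd's shutdown path: child_main() and teardown_demo() are hypothetical names, and the child's event loop is reduced to a bare read() loop.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child: read until the parent closes its end, then exit cleanly. */
static void
child_main(int fd)
{
	char	buf[64];
	ssize_t	n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* a real child would process messages here */
	_exit(n == 0 ? 0 : 1);	/* 0 bytes read means EOF */
}

/* Returns the child's exit status, or -1 on error. */
int
teardown_demo(void)
{
	int	sv[2], status;
	pid_t	pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0) {
		close(sv[0]);
		child_main(sv[1]);
	}
	close(sv[1]);
	close(sv[0]);	/* no kill(2): closing the socket is the signal */
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```

Because the parent never sends a signal to a stored PID, there is no window in which a recycled PID could be hit by mistake.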


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate a IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
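
The ordering can be sketched as follows. The names reconfig_pending,
do_reload() and handle_pending() are made up for illustration and do not
come from bgpd.c. Clearing the flag before doing the work means a signal
that arrives while the work is running sets the flag again and is handled
on the next pass instead of being overwritten by a late "flag = 0".

```c
#include <signal.h>

/* Set by the signal handler, consumed by the main loop (hypothetical). */
static volatile sig_atomic_t reconfig_pending;
static int reload_count;	/* illustration only: counts work done */

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reload_count++;
}

/* Returns 1 if work was done, 0 if no signal was pending. */
static int
handle_pending(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;	/* clear first ... */
	do_reload();		/* ... then work; a new signal re-arms the flag */
	return 1;
}
```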


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
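
A sketch of the check described above, under the assumption that the
parent reaps children without blocking. child_terminated() is a
hypothetical helper, not the code from bgpd.c. It only reports a child as
gone when waitpid() returns a status that says the child exited or was
killed by a signal.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Reap one child without blocking.  Returns 1 and stores the pid in *gone
 * if a child exited or was killed by a signal, 0 otherwise.  SIGCHLD is
 * also raised when a child stops or continues; in that case waitpid()
 * with WNOHANG alone returns 0 and nothing is reported.
 */
int
child_terminated(pid_t *gone)
{
	pid_t	pid;
	int	status;

	assert(gone != NULL);
	for (;;) {
		pid = waitpid(-1, &status, WNOHANG);
		if (pid == -1) {
			if (errno == EINTR)
				continue;
			return 0;	/* e.g. ECHILD: no children left */
		}
		if (pid == 0)
			return 0;	/* children exist, none terminated */
		if (WIFEXITED(status) || WIFSIGNALED(status)) {
			*gone = pid;
			return 1;
		}
	}
}
```

The caller would compare the returned pid against the pids it recorded at
fork time to decide which child went away before logging and quitting.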


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
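
The split between "read into the buffer" and "take one message out of the
buffer" can be sketched with a toy framing format. Everything here is an
assumption for illustration: struct rbuf, rbuf_fill(), rbuf_get() and
rbuf_drain() are hypothetical, and the 1-byte length header is not the
imsg wire format. The point is the drain loop: after one fill, keep
extracting until no complete message is left.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct rbuf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/* Append bytes as one read(2) might deliver them; -1 if they do not fit. */
int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return -1;
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one message: a 1-byte payload length followed by the payload.
 * Returns the payload length, or -1 if no complete message is buffered.
 */
int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	size_t plen;

	if (rb->len < 1)
		return -1;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return -1;		/* partial message, wait for more */
	assert(plen <= outsz);
	memcpy(out, rb->data + 1, plen);
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return (int)plen;
}

/* Process every complete message; returns how many were handled. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char msg[255];
	int count = 0;

	while (rbuf_get(rb, msg, sizeof(msg)) != -1)
		count++;		/* a real caller would dispatch msg */
	return count;
}
```

Calling the extract function only once per poll event would leave the
second message in the buffer until the next event, which is the queue
build-up the commit message describes.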


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.240 20-Jan-2022 claudio

Make sure that all poll loops properly restart the poll loop on EINTR.
Also either fail hard or restart after other errors. In any case do not
look at pollfds after an error.
OK benno@
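
A minimal sketch of such a loop, assuming a plain poll(2) caller.
poll_restart() is a hypothetical wrapper, not a function in bgpd. It
retries on EINTR, hands every other error back to the caller, and makes
clear that revents may only be inspected when the return value is positive.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Returns the number of ready descriptors, 0 on timeout, or -1 on a hard
 * error.  pfd[i].revents is only meaningful when the result is > 0.
 * Note: restarting with the same timeout_ms restarts the full timeout.
 */
int
poll_restart(struct pollfd *pfd, nfds_t npfd, int timeout_ms)
{
	int	nready;
	nfds_t	i;

	assert(pfd != NULL || npfd == 0);
	for (;;) {
		/* start from a clean slate so stale revents are never seen */
		for (i = 0; i < npfd; i++)
			pfd[i].revents = 0;
		nready = poll(pfd, npfd, timeout_ms);
		if (nready == -1 && errno == EINTR)
			continue;	/* interrupted by a signal: restart */
		return nready;
	}
}
```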


Revision tags: OPENBSD_7_0_BASE
# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@
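
The general technique can be sketched like this. start_connect(),
wait_writable() and finish_connect() are hypothetical helpers and not the
functions used in bgpd. The socket is made non-blocking before connect(),
EINPROGRESS is treated as "pending", and the outcome is read from SO_ERROR
once the socket becomes writable.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Returns a socket with a connect pending or completed, or -1 on error. */
int
start_connect(const struct sockaddr *sa, socklen_t len)
{
	int fd, flags;

	assert(sa != NULL);
	if ((fd = socket(sa->sa_family, SOCK_STREAM, 0)) == -1)
		return -1;
	if ((flags = fcntl(fd, F_GETFL)) == -1 ||
	    fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
		close(fd);
		return -1;
	}
	if (connect(fd, sa, len) == -1 && errno != EINPROGRESS) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Returns 1 when poll reports an event on fd, 0 on timeout, -1 on error. */
int
wait_writable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int n;

	do {
		n = poll(&pfd, 1, timeout_ms);
	} while (n == -1 && errno == EINTR);
	return n;
}

/* After the socket became writable: 0 on success, else an errno value. */
int
finish_connect(int fd)
{
	int err = 0;
	socklen_t len = sizeof(err);

	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		return errno;
	return err;
}
```

In a daemon the pending socket would be added to the main poll set rather
than waited on separately, so an unreachable peer never stalls the loop.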


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
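
The problem is the unspecified evaluation order of function arguments in
C. A sketch, with fill_sockaddr() as a hypothetical stand-in for a helper
that returns a sockaddr and stores its length through a pointer (it is not
addr2sa() and makes no claim about that function's signature):

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills *sin and reports its size through *len. */
struct sockaddr *
fill_sockaddr(struct sockaddr_in *sin, in_port_t port, socklen_t *len)
{
	assert(sin != NULL && len != NULL);
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)sin;
}

/* Bind fd to the loopback address on the given port; returns 0 or -1. */
int
bind_loopback(int fd, in_port_t port)
{
	struct sockaddr_in	 sin;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Fragile form:
	 *	bind(fd, fill_sockaddr(&sin, port, &len), len);
	 * The compiler may read len for the third argument before the
	 * helper call in the second argument has stored into it.
	 */
	sa = fill_sockaddr(&sin, port, &len);	/* sequenced before bind() */
	return bind(fd, sa, len);
}
```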


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config(), which sends the config to the SE and RDE,
out of reconfigure().
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
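
A rough sketch of the buffer sizing described above, under stated assumptions: READ_SIZE and make_ipc_pair() are illustrative names, the value 65535 is assumed rather than taken from bgpd, and a failed resize is treated as non-fatal here, which may differ from what bgpd does.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* assumed read size, for illustration only */

/* Create a socketpair and ask for send/receive buffers of 4 * READ_SIZE. */
int
make_ipc_pair(int pair[2])
{
	int i, sz = 4 * READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		/* best effort: the pair still works with default buffers */
		(void)setsockopt(pair[i], SOL_SOCKET, SO_RCVBUF, &sz, sizeof(sz));
		(void)setsockopt(pair[i], SOL_SOCKET, SO_SNDBUF, &sz, sizeof(sz));
	}
	return 0;
}
```

With larger buffers a single wakeup from poll() can move several reads' worth of data, which is the efficiency gain the commit refers to.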


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
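
The idiom can be illustrated with a small sketch. struct peer_msg, send_obj() and send_peer() are hypothetical; send_obj() merely stands in for an imsg compose call and copies into a caller-supplied buffer.

```c
#include <stddef.h>
#include <string.h>

struct peer_msg {		/* hypothetical payload type */
	unsigned int	id;
	char		descr[32];
};

/* Hypothetical stand-in for an imsg compose call: copies len bytes to buf. */
size_t
send_obj(void *buf, size_t buflen, const void *data, size_t len)
{
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

size_t
send_peer(void *buf, size_t buflen, const struct peer_msg *p)
{
	/*
	 * sizeof(*p) follows the declared type of p, so the size stays
	 * right if p is later changed to point at a different struct.
	 */
	return send_obj(buf, buflen, p, sizeof(*p));
}
```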


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process along the way. The keys are refreshed whenever the session state
changes to IDLE or ACTIVE. This should behave better when reloading configs
with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
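
One way to handle the oldd == newd case is sketched below. move_fd() is a hypothetical helper, and clearing the flag with fcntl() is an assumption about a reasonable fix rather than a copy of the committed change.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make oldd available as newd so that it survives exec(). */
int
move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would do nothing here and leave FD_CLOEXEC set */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		return fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0;
	}
	/* the descriptor created by dup2() does not inherit FD_CLOEXEC */
	return dup2(oldd, newd) == -1 ? -1 : 0;
}
```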


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd;
this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain), the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that is the IMSG_RECONF_DONE message sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes it all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
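
The bsearch-based lookup can be sketched like this. The function names and the flat uint32_t array are assumptions for illustration; the structures bgpd actually uses are not shown in this log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once after loading so that lookups can use bsearch(). */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

A lookup costs O(log n) comparisons, against a linear walk over a long list of AS numbers in a filter rule.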


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
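
The "pipe teardown" idea can be demonstrated with a small self-contained sketch. teardown_demo() is a hypothetical function: the child stands in for an RDE or session engine event loop, and the parent shuts it down without kill(2).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit status (0 if it saw EOF), or -1 on error. */
int
teardown_demo(void)
{
	int sv[2], status;
	char buf[64];
	ssize_t n;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: keep reading until the parent's end is closed */
		close(sv[0]);
		while ((n = read(sv[1], buf, sizeof(buf))) > 0)
			;
		_exit(n == 0 ? 0 : 1);	/* n == 0 means EOF */
	}
	/* parent: closing the IPC socket is the whole shutdown request */
	close(sv[1]);
	close(sv[0]);
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent never signals a PID, it cannot hit a recycled one, which matches the race-removal point in the list above.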


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
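
What a caller-side EAGAIN check might look like is sketched below with plain write(2). flush_buf() is a hypothetical helper and does not reproduce the msgbuf_write() interface.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Write buf[*off..len) to fd.
 * Returns 1 when everything was written, 0 when the descriptor would
 * block (the caller should poll for POLLOUT and call again), -1 on error.
 */
int
flush_buf(int fd, const char *buf, size_t len, size_t *off)
{
	ssize_t n;

	while (*off < len) {
		n = write(fd, buf + *off, len - *off);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN)
				return 0;	/* not an error: retry later */
			return -1;
		}
		*off += (size_t)n;
	}
	return 1;
}
```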


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
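
The behaviour relied on here can be shown with a short sketch: once SIGPIPE is ignored, writing to a pipe with no reader fails with EPIPE instead of killing the process. write_detects_broken_pipe() is a hypothetical demo function.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Returns 1 if a write to a reader-less pipe failed with EPIPE. */
int
write_detects_broken_pipe(void)
{
	int p[2], broken;
	ssize_t n;

	signal(SIGPIPE, SIG_IGN);	/* do not die on a broken pipe */
	if (pipe(p) == -1)
		return -1;
	close(p[0]);			/* no reader is left */
	n = write(p[1], "x", 1);
	broken = (n == -1 && errno == EPIPE);
	close(p[1]);
	return broken;
}
```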


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
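
A sketch of the fix's idea, with hypothetical names: the pollfd array is zeroed on every pass, so a poll() that fails (for example with EINTR) cannot leave revents from an earlier iteration behind.

```c
#include <poll.h>
#include <string.h>

/* Rebuild the pollfd array from scratch, then poll for readability. */
int
poll_once(struct pollfd *pfd, const int *fds, nfds_t n, int timeout)
{
	nfds_t i;

	/* clear stale revents: poll() does not update pfd on error */
	memset(pfd, 0, n * sizeof(*pfd));
	for (i = 0; i < n; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return poll(pfd, n, timeout);
}
```

A caller should still check the return value and skip the revents handling when it is -1 or 0.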


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of allocating them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
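
The ordering described above can be sketched as follows. The names reconfig, sighdlr(), do_reload() and check_signals() are chosen for illustration only and are not taken from bgpd.c.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig;	/* set by the signal handler */
static int reloads;			/* counts completed reloads */

void
sighdlr(int sig)
{
	(void)sig;
	reconfig = 1;
}

static void
do_reload(void)
{
	reloads++;
}

/* Returns 1 if a reload was performed, 0 otherwise. */
int
check_signals(void)
{
	if (reconfig) {
		/*
		 * Clear the flag first. A signal that arrives while
		 * do_reload() runs sets it again and is handled on the
		 * next pass; clearing afterwards would lose that signal.
		 */
		reconfig = 0;
		do_reload();
		return 1;
	}
	return 0;
}
```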


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
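
The point of the change above is that SIGCHLD alone says nothing about why it was raised, so the status from waitpid() has to be inspected. A small sketch of that check, assuming hypothetical names (check_child(), enum child_state, reap_demo()) that are not taken from bgpd:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

enum child_state {
	CHILD_RUNNING,		/* nothing to report yet */
	CHILD_EXITED,		/* exit(3); *code holds the exit status */
	CHILD_SIGNALED,		/* killed by a signal; *code holds the signal */
	CHILD_ERROR		/* waitpid() itself failed */
};

/*
 * Inspect one child. A SIGCHLD-driven main loop would pass WNOHANG as
 * options so it never blocks.
 */
enum child_state
check_child(pid_t pid, int options, int *code)
{
	int	status;
	pid_t	r;

	assert(code != NULL);
	r = waitpid(pid, &status, options);
	if (r == -1)
		return (CHILD_ERROR);
	if (r == 0)
		return (CHILD_RUNNING);
	if (WIFEXITED(status)) {
		*code = WEXITSTATUS(status);
		return (CHILD_EXITED);
	}
	if (WIFSIGNALED(status)) {
		*code = WTERMSIG(status);
		return (CHILD_SIGNALED);
	}
	return (CHILD_RUNNING);
}

/* Fork a child that exits with status 3 and report what was seen. */
int
reap_demo(void)
{
	pid_t	pid;
	int	code = -1;

	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0)
		_exit(3);
	if (check_child(pid, 0, &code) != CHILD_EXITED)
		return (-1);
	return (code);
}
```

Only CHILD_EXITED and CHILD_SIGNALED correspond to the "log & quit" case in the commit message; a stopped or still running child is not a reason to shut down.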


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
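
The read-once, drain-in-a-loop pattern described above can be sketched with a toy framing layer. This is not the imsg API: struct rbuf, msg_get() and msg_drain() are hypothetical, and the wire format (a host-order 16-bit length followed by the payload) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096

struct rbuf {
	uint8_t	buf[RBUF_SIZE];
	size_t	len;			/* bytes currently buffered */
};

/*
 * Extract one [uint16 length][payload] message from rb without reading
 * from any descriptor. Returns 1 if a whole message was copied to out,
 * 0 if more data is needed, -1 if the payload does not fit into out.
 */
int
msg_get(struct rbuf *rb, uint8_t *out, size_t outsz, size_t *outlen)
{
	uint16_t	plen;
	size_t		total;

	assert(rb->len <= RBUF_SIZE);
	if (rb->len < sizeof(plen))
		return (0);
	memcpy(&plen, rb->buf, sizeof(plen));
	total = sizeof(plen) + plen;
	if (rb->len < total)
		return (0);
	if (plen > outsz)
		return (-1);
	memcpy(out, rb->buf + sizeof(plen), plen);
	*outlen = plen;
	/* Shift whatever follows to the front for the next call. */
	rb->len -= total;
	memmove(rb->buf, rb->buf + total, rb->len);
	return (1);
}

/* After one read filled rb, handle every complete message it holds. */
int
msg_drain(struct rbuf *rb)
{
	uint8_t	payload[256];
	size_t	n;
	int	count = 0;

	while (msg_get(rb, payload, sizeof(payload), &n) == 1)
		count++;		/* a real caller would dispatch here */
	return (count);
}
```

Calling msg_get() only once per poll event would leave the second message sitting in the buffer until more data arrives, which is the queue build-up the commit message describes.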


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.239 20-Jul-2021 claudio

Add -V to usage. Reported by Pier Carlo Chiodi.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
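
The hazard described above comes from C leaving the evaluation order of function arguments unspecified. A sketch of the safe form follows; fill_sa() is a hypothetical stand-in for addr2sa() with an assumed signature, and bind_loopback() exists only for the example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fills ss and stores the address length in *len. */
struct sockaddr *
fill_sa(struct sockaddr_storage *ss, in_port_t port, socklen_t *len)
{
	struct sockaddr_in	*sin = (struct sockaddr_in *)ss;

	assert(len != NULL);
	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return ((struct sockaddr *)ss);
}

int
bind_loopback(int fd, in_port_t port)
{
	struct sockaddr_storage	 ss;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Risky form: bind(fd, fill_sa(&ss, port, &len), len);
	 * the compiler may read len for the third argument before
	 * fill_sa() has stored to it, so 0 could be passed.
	 */
	sa = fill_sa(&ss, port, &len);	/* len is final after this line */
	return (bind(fd, sa, len));
}
```

Splitting the call into two statements puts a sequence point between the store to len and its use, so every compiler sees the same value.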


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
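
A sketch of enlarging socketpair() buffers as described above. READ_SIZE is an assumed value and not bgpd's actual constant, and big_socketpair() is a hypothetical helper; the kernel may also clamp or round the requested size.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* assumed read buffer size for the sketch */

/* Create a socketpair and request 4 * READ_SIZE for both directions. */
int
big_socketpair(int sv[2])
{
	int	i, size = 4 * READ_SIZE;

	assert(sv != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &size, sizeof(size)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &size, sizeof(size)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

With larger socket buffers a single poll() wakeup is more likely to find enough data (or space) for a full-size read or write.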


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
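
A sketch of the case distinction described above: dup2() clears close-on-exec on the copy it creates, but when both descriptors are equal it does nothing, so the flag has to be cleared explicitly. place_fd() is a hypothetical helper name, not a function from bgpd.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Make oldd available as newd in a way that survives exec(). */
int
place_fd(int oldd, int newd)
{
	int	flags;

	assert(oldd >= 0 && newd >= 0);
	if (oldd != newd) {
		/* The duplicate starts with FD_CLOEXEC cleared. */
		return (dup2(oldd, newd) == -1 ? -1 : 0);
	}
	/* Same descriptor: dup2() would be a no-op, so clear the flag. */
	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return (-1);
	return (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0);
}
```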


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
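
A sketch of a sorted-array membership test with bsearch(3), the lookup technique named above. The function names and the flat uint32_t array are assumptions for illustration and do not reflect bgpd's set_table layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once after loading so that lookups can use binary search. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(set[0]), as_cmp);
}

/* Returns 1 if asn is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t asn)
{
	assert(set != NULL);
	return (bsearch(&asn, set, n, sizeof(set[0]), as_cmp) != NULL);
}
```

A lookup costs O(log n) comparisons, whereas walking a list of AS numbers in a filter rule is linear in the list length.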


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
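
The teardown described above relies on read() returning 0 once the other end of the IPC socket is closed. A single-process sketch of that behaviour, with hypothetical helper names (peer_gone(), teardown_demo()):

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <unistd.h>

/*
 * Returns 1 if the peer closed its end (EOF), 0 if data arrived,
 * -1 on a read error.
 */
int
peer_gone(int fd)
{
	char	buf[64];
	ssize_t	n;

	assert(fd >= 0);
	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return (-1);
	return (n == 0);
}

/* The "parent" end is closed; the "child" end then sees EOF. */
int
teardown_demo(void)
{
	int	sv[2], r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	close(sv[0]);			/* parent shuts down */
	r = peer_gone(sv[1]);		/* child's event loop notices */
	close(sv[1]);
	return (r);
}
```

Because the child reacts to EOF on its own descriptor, the parent needs neither the child's PID nor kill(2), which is what removes the PID reuse race.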


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
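
A sketch of setting both flags atomically at socket creation instead of with separate fcntl() calls afterwards. The helper names are hypothetical; accept4() takes the same two flags for accepted connections but is not exercised here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <fcntl.h>

/* Create a stream socket that is non-blocking and close-on-exec. */
int
open_session_socket(int domain)
{
	return (socket(domain, SOCK_STREAM | SOCK_CLOEXEC | SOCK_NONBLOCK, 0));
}

/* Returns 1 if fd has FD_CLOEXEC and O_NONBLOCK set, 0 if not, -1 on error. */
int
has_both_flags(int fd)
{
	int	fdfl, stfl;

	assert(fd >= 0);
	fdfl = fcntl(fd, F_GETFD);
	stfl = fcntl(fd, F_GETFL);
	if (fdfl == -1 || stfl == -1)
		return (-1);
	return ((fdfl & FD_CLOEXEC) != 0 && (stfl & O_NONBLOCK) != 0);
}
```

Setting the flags in the socket() call leaves no window in which a concurrent fork and exec could inherit the descriptor without close-on-exec.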


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
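
To illustrate why the signal is not needed: with SIGPIPE ignored, a write to a
pipe that has no reader fails with EPIPE, which the normal error path can
handle. This is a small sketch; the function name write_to_broken_pipe() is
hypothetical and does not appear in bgpd.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical demo: ignore SIGPIPE, then write to a pipe whose read end
 * has been closed. Returns the errno reported by write(), 0 if the write
 * unexpectedly succeeded, or -1 if the setup failed.
 */
int
write_to_broken_pipe(void)
{
	int	fds[2];
	int	saved = 0;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);			/* no reader left */

	if (write(fds[1], "x", 1) == -1)
		saved = errno;		/* EPIPE instead of process death */
	close(fds[1]);
	return saved;
}
```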


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
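
A sketch of the pattern, assuming a hypothetical wrapper poll_fresh() that is
not part of bgpd. It uses memset(3) where the commit used bzero(3); the effect
is the same. Because the array is rebuilt on every call, revents left over
from an earlier round cannot be mistaken for fresh results when poll(2)
returns an error.

```c
#include <poll.h>
#include <string.h>

/*
 * Hypothetical wrapper: zero the pollfd array, fill it in again, and
 * only then call poll(). If poll() fails (for example with EINTR) the
 * revents fields are still 0 rather than stale.
 */
int
poll_fresh(struct pollfd *pfd, const int *fds, nfds_t nfds, int timeout)
{
	nfds_t	i;

	memset(pfd, 0, nfds * sizeof(*pfd));	/* clears stale revents */
	for (i = 0; i < nfds; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return poll(pfd, nfds, timeout);
}
```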


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
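
The ordering can be sketched as follows. All names here (reconfig_pending,
sighup_handler, do_reload, check_signals) are hypothetical and chosen for
illustration; they are not the identifiers used in bgpd.c.

```c
#include <signal.h>

volatile sig_atomic_t	reconfig_pending;	/* set by the signal handler */
int			reloads_done;		/* counts completed reloads */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_reload() is running sets the flag again and is handled on the next
 * pass, instead of being wiped out by a late "flag = 0".
 */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```

With the original order, a second signal delivered during the work would be
lost because the flag was cleared after the work had finished.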


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
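
A rough sketch of the check described above. The helper names child_is_gone()
and reap_child() are hypothetical, and the use of WNOHANG is an assumption;
the log does not say which waitpid(2) options bgpd passed.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: a wait status only means the child is gone if it
 * exited or was terminated by a signal. SIGCHLD is also raised for
 * stopped children, which must not be treated as a crash.
 */
int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Reap one child without blocking. Returns the pid of a child that is
 * really gone, 0 if there is nothing to act on, or -1 on error (for
 * example when there are no children at all).
 */
pid_t
reap_child(void)
{
	pid_t	pid;
	int	status;

	pid = waitpid(-1, &status, WNOHANG);	/* assumption: non-blocking */
	if (pid <= 0)
		return pid;
	return child_is_gone(status) ? pid : 0;
}
```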


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
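
The split between "read into the buffer" and "pop one message from the buffer"
can be sketched like this. The struct rbuf, the function rbuf_get() and the
frame format (a 2-byte host-order length followed by the payload) are all
assumptions made for illustration; they are not the real imsg structures or
wire format.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096

/* Hypothetical read buffer; not the real struct imsgbuf. */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/*
 * Pop one length-prefixed message from the buffer without touching any
 * file descriptor. Returns 1 if a message was copied to out (its size is
 * stored in *outlen), 0 if no complete message is buffered, or -1 if the
 * message does not fit into out.
 */
int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz, size_t *outlen)
{
	uint16_t	plen;
	size_t		total;

	if (rb->len < sizeof(plen))
		return 0;
	memcpy(&plen, rb->data, sizeof(plen));
	total = sizeof(plen) + plen;
	if (rb->len < total)
		return 0;		/* wait for the next read */
	if (plen > outsz)
		return -1;
	memcpy(out, rb->data + sizeof(plen), plen);
	*outlen = plen;
	rb->len -= total;
	memmove(rb->data, rb->data + total, rb->len);
	return 1;
}
```

After each successful read into the buffer, a caller would invoke rbuf_get()
in a loop until it returns 0, so that several messages delivered by a single
read are all processed before the next poll.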


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
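
A short sketch of why the type matters; read_checked() is a hypothetical
helper, not a function from bgpd. If the result of read(2) were stored in a
size_t, the -1 error return would become a very large positive value and a
check such as "n < 0" could never be true.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep the result of read() in a signed ssize_t so
 * that the -1 error return stays distinguishable from a byte count.
 */
ssize_t
read_checked(int fd, void *buf, size_t len)
{
	ssize_t	n;

	n = read(fd, buf, len);
	if (n < 0)
		return -1;	/* errno was set by read() */
	return n;		/* 0 means EOF, > 0 is the byte count */
}
```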


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.238 16-Jun-2021 job

Add command line option to show the version

OK claudio@


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux system.
OK denis@
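
The hazard can be sketched with stand-in functions. fill_sa() plays the role
of addr2sa() and use_sa() the role of connect()/bind(); both are hypothetical,
since the real signatures are not shown in this log. C leaves the order in
which function arguments are evaluated unspecified, so when the helper is
called inside the argument list, len may be read before the helper has written
it.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills in a sockaddr and stores
 * its length through *len.
 */
struct sockaddr *
fill_sa(struct sockaddr_storage *ss, socklen_t *len)
{
	struct sockaddr_in	*sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

/* Stand-in for connect()/bind(): reports the length it was handed. */
socklen_t
use_sa(const struct sockaddr *sa, socklen_t len)
{
	(void)sa;
	return len;
}

socklen_t
safe_call(void)
{
	struct sockaddr_storage	 ss;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Risky form: use_sa(fill_sa(&ss, &len), len);
	 * the second argument may be evaluated before fill_sa() runs.
	 */
	sa = fill_sa(&ss, &len);	/* completed before the next statement */
	return use_sa(sa, len);		/* len is guaranteed to be set here */
}
```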


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
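
A rough sketch of the idea, using standard socket options. READ_SIZE is an
assumed value and make_ipc_pair() is a hypothetical helper; the constants
actually used by bgpd are not shown in this log.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumed read size; the real constant in bgpd may differ. */
#define READ_SIZE	65535
#define SOCK_BUF_SIZE	(4 * READ_SIZE)

/* Create a socketpair with enlarged buffers; returns 0 or -1. */
int
make_ipc_pair(int sv[2])
{
	int i, bsize = SOCK_BUF_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```

Some systems silently cap the requested size, so the effective buffer may
be smaller than what was asked for.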


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
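
The benefit of this form, sketched with hypothetical types: sizeof(*p)
keeps matching the object even if the pointer's declared type changes
later. struct peer_config and queue_obj() are illustrative stand-ins, not
the imsg API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object; not the struct used by bgpd. */
struct peer_config {
	unsigned int	id;
	char		descr[32];
};

static unsigned char wire[256];

/* Hypothetical stand-in for an imsg compose call; returns bytes queued. */
size_t
queue_obj(const void *data, size_t len)
{
	if (len > sizeof(wire))
		return 0;
	memcpy(wire, data, len);
	return len;
}

size_t
send_peer(const struct peer_config *p)
{
	/* sizeof(*p) follows the pointer's type if the declaration changes. */
	return queue_obj(p, sizeof(*p));
}
```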


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling, moving the pfkey socket to the
parent process along the way. The refreshing of the keys is done whenever
the session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
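
One way to handle that case, sketched under the assumption that the
descriptor should survive exec() in the target slot. place_fd() is a
hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make oldd available as newd without the close-on-exec flag. */
int
place_fd(int oldd, int newd)
{
	if (oldd == newd) {
		/* dup2() would do nothing here, so drop FD_CLOEXEC by hand. */
		if (fcntl(oldd, F_SETFD, 0) == -1)
			return -1;
		return newd;
	}
	/* The duplicate created by dup2() does not carry FD_CLOEXEC. */
	return dup2(oldd, newd);
}
```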


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
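
The lookup idea can be sketched with the standard qsort() and bsearch()
calls. The function names and the plain uint32_t array are assumptions
made for illustration; the real as-set structures are not shown in this
log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way compare of two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once after it has been loaded. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Compared with walking a long list of AS numbers per filter rule, each
lookup takes a logarithmic number of comparisons.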


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding or modifying restricted control sockets at
runtime. Feature request by a few people who often forgot to add -r path when
restarting bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
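
Detecting a broken pipe without the signal can be sketched like this: with
SIGPIPE ignored, write() on a pipe whose reader is gone fails with EPIPE
instead of killing the process. Both function names are hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE so a dead peer shows up as an error from write(). */
void
ignore_sigpipe(void)
{
	signal(SIGPIPE, SIG_IGN);
}

/*
 * Returns 0 if the write succeeded, 1 if the peer is gone (EPIPE),
 * and -1 on any other error.
 */
int
write_check_peer(int fd, const void *buf, size_t len)
{
	if (write(fd, buf, len) == -1)
		return errno == EPIPE ? 1 : -1;
	return 0;
}
```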


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
nameley the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
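
A small sketch of the pattern, assuming a reused pollfd array as in a
typical poll() loop. wait_readable() is a hypothetical helper; it also
avoids looking at revents when poll() fails or times out.

```c
#include <poll.h>
#include <string.h>

/*
 * Returns 1 if fd is readable or hung up, 0 on timeout or other
 * events, and -1 if poll() itself failed.
 */
int
wait_readable(int fd, int timeout_ms)
{
	static struct pollfd pfd[1];	/* reused between calls */
	int nfds;

	memset(pfd, 0, sizeof(pfd));	/* no stale revents from earlier rounds */
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds <= 0)
		return nfds;		/* error or timeout: revents is not valid */
	return (pfd[0].revents & (POLLIN | POLLHUP)) != 0;
}
```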


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
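
The corrected ordering, sketched with hypothetical names (reconfig,
do_reload, handle_signals). A signal that arrives while the work is
running sets the flag again and is picked up on the next pass instead of
being wiped out.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig;	/* set by the handler */
static int reloads;			/* counts completed reloads */

/* Signal handler: only records that a reload was requested. */
void
sighdlr(int sig)
{
	if (sig == SIGHUP)
		reconfig = 1;
}

/* Hypothetical work function. */
static void
do_reload(void)
{
	reloads++;
}

/* Called from the main loop after poll() returns. */
void
handle_signals(void)
{
	if (reconfig) {
		reconfig = 0;	/* clear first ... */
		do_reload();	/* ... so a signal arriving now is kept */
	}
}
```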


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
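
The underlying mechanism is SCM_RIGHTS ancillary data on a Unix domain
socket. The following is a generic sketch of that mechanism, not the
imsg/msgbuf code from this commit; send_fd() and recv_fd() are
hypothetical names.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the Unix socket sock; returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;		/* one payload byte carries the fd */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock; returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

With this, a privileged process can socket() and bind() a low port and
hand the resulting descriptor to a process that has already dropped
privileges.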


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
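
A minimal sketch of the check this entry describes, assuming a single child whose pid is known. The helper name check_child() is hypothetical and this is not the code from bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Poll one child after SIGCHLD.
 * Returns 1 if it exited or was killed by a signal, 0 if there is
 * no state change to report, -1 on waitpid() error.
 */
static int
check_child(pid_t pid)
{
	int	status;
	pid_t	r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* child is still running */
	return (WIFEXITED(status) || WIFSIGNALED(status));
}
```

Only a return value of 1 would justify logging and shutting down; a SIGCHLD by itself says nothing about why it was sent.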


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
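
The read-once, drain-in-a-loop pattern can be sketched with a toy framing layer. Everything here is hypothetical (struct toy_buf, toy_get(), toy_drain(), and the 1-byte length prefix); it does not reproduce the imsg API or its wire format.

```c
#include <stddef.h>
#include <string.h>

/* Toy framing: one length byte followed by that many payload bytes. */
struct toy_buf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer without reading from
 * any fd. Returns 1 if a message was copied to out, 0 otherwise.
 */
static int
toy_get(struct toy_buf *b, unsigned char *out, size_t *outlen)
{
	size_t	need;

	if (b->len < 1)
		return (0);
	need = 1 + (size_t)b->data[0];
	if (b->len < need)
		return (0);	/* partial message, wait for more data */
	*outlen = need - 1;
	memcpy(out, b->data + 1, *outlen);
	memmove(b->data, b->data + need, b->len - need);
	b->len -= need;
	return (1);
}

/* After one read into the buffer, process every complete message. */
static int
toy_drain(struct toy_buf *b)
{
	unsigned char	msg[255];
	size_t		msglen;
	int		n = 0;

	while (toy_get(b, msg, &msglen))
		n++;
	return (n);
}
```

Calling the extractor only once per poll event would leave the second message sitting in the buffer until the next event, which is the queue build-up the entry describes.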


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.237 17-May-2021 claudio

Limit the number of concurrent RTR connects to 32.
If the limit is hit the request will be dropped and the rtr process will
retry the connect after the retry timeout. Hopefully by then the number of
connections is down again.
OK deraadt@ benno@


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
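
The hazard can be sketched with stand-in functions. C leaves the evaluation order of function arguments unspecified, so a helper that writes len through a pointer must not share an argument list with a read of len. toy_addr2sa() and toy_connect() below are hypothetical stand-ins, not the bgpd helpers or the real connect(2).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/* Stand-in for addr2sa(): returns the sockaddr and fills *len. */
static struct sockaddr *
toy_addr2sa(struct sockaddr_in *sin, in_port_t port, socklen_t *len)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return ((struct sockaddr *)sin);
}

/* Stand-in for connect()/bind(): reports the length it was handed. */
static socklen_t
toy_connect(const struct sockaddr *sa, socklen_t len)
{
	(void)sa;
	return (len);
}

static socklen_t
safe_call(void)
{
	struct sockaddr_in	 sin;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Risky form: toy_connect(toy_addr2sa(&sin, 179, &len), len);
	 * len may be read before or after the helper has written it.
	 */
	sa = toy_addr2sa(&sin, 179, &len);	/* len is set here ... */
	return (toy_connect(sa, len));		/* ... and read here */
}
```

Splitting the call into two statements puts a sequence point between the write and the read, so the result no longer depends on the compiler.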


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
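
A minimal sketch of enlarging both buffers on each end of a socketpair. TOY_READ_SIZE and make_pair() are hypothetical; the actual read size and socket flags used by bgpd are not stated in this entry.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define TOY_READ_SIZE	65535	/* assumed read size, for illustration */

/* Create a socketpair and grow its buffers to 4 times the read size. */
static int
make_pair(int sv[2])
{
	int	bufsz = 4 * TOY_READ_SIZE;
	int	i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsz, sizeof(bufsz)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsz, sizeof(bufsz)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

Some systems silently cap the requested size, so the effective buffer may be smaller than asked for.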


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
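
One way to handle the equal-descriptor case is sketched below. The helper move_fd() is hypothetical and this is not necessarily how bgpd.c does it; it relies only on the documented behaviour that dup2() with equal arguments does nothing, so the close-on-exec flag has to be cleared by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make newd refer to oldd's file and survive exec().
 * Returns 0 on success, -1 on error.
 */
static int
move_fd(int oldd, int newd)
{
	int	flags;

	if (oldd == newd) {
		/* dup2() would be a no-op and leave FD_CLOEXEC set */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* a descriptor created by dup2() never has FD_CLOEXEC set */
	return (dup2(oldd, newd) == -1 ? -1 : 0);
}
```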


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
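
A minimal sketch of a membership test with bsearch(3) over a sorted array of AS numbers. The function names and the flat uint32_t array are assumptions for illustration; the as-set data structure in bgpd is not shown in this entry.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers, without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* The array must be sorted in ascending order. Returns 1 if found. */
static int
as_in_set(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

The lookup cost grows with the logarithm of the set size, compared with a linear walk over a long list of AS numbers in a filter rule.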


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
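
The teardown can be sketched with one parent and one child. teardown_demo() is a hypothetical, self-contained illustration; the real children run a poll loop over imsg pipes rather than a bare read().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit code, or -1 on error. */
static int
teardown_demo(void)
{
	int	sv[2], status;
	pid_t	pid;
	char	c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if ((pid = fork()) == -1)
		return (-1);
	if (pid == 0) {
		close(sv[0]);
		/* child "event loop": ends when read() reports EOF */
		while (read(sv[1], &c, 1) > 0)
			;
		_exit(0);
	}
	close(sv[1]);
	close(sv[0]);	/* no kill(2): closing our end delivers EOF */
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Because the parent never sends a signal to a stored pid, there is no window in which a recycled pid could be hit by mistake.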


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
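
A minimal sketch of the fix described here, assuming a fixed-size array and a hypothetical helper name (this is not the bgpd poll loop): the pollfd array is rebuilt from zero on every pass, so stale revents bits cannot survive a poll() call that fails with EINTR.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical: number of descriptors watched */

/*
 * Zero and refill the pollfd array, then poll.  Returns poll()'s
 * result.  On -1 (e.g. EINTR) the kernel does not write revents back,
 * so without the memset the previous pass's bits would still be there.
 */
int
poll_fresh(struct pollfd *pfd, const int *fds, int timeout_ms)
{
	int i;

	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	for (i = 0; i < PFD_COUNT; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return poll(pfd, PFD_COUNT, timeout_ms);
}
```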


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
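
The ordering this commit describes can be sketched as follows. The flag, handler, and work function names are made up for the example; only the clear-then-work order is taken from the commit message.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* hypothetical flag name */
int reloads_done;			/* counts calls, for illustration */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop after poll() returns. */
void
handle_signals(void)
{
	if (reconfig_pending) {
		/*
		 * Clear first.  A signal that arrives while do_reload()
		 * runs sets the flag again and is handled on the next
		 * pass.  Clearing afterwards would silently drop it.
		 */
		reconfig_pending = 0;
		do_reload();
	}
}
```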


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
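
For readers unfamiliar with descriptor passing, the underlying mechanism is SCM_RIGHTS ancillary data on a UNIX-domain socket. The sketch below shows that mechanism in isolation, with hypothetical send_fd()/recv_fd() helpers. It is not the imsg/msgbuf implementation, which wraps the same idea in its own message framing.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Pass fd over the UNIX socket sock along with one dummy data byte. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(); returns it, or -1 on error. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scenario from the commit, the privileged process would call socket() and bind() and then hand the resulting descriptor to the unprivileged one this way.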


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
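
The check this commit describes could be sketched like this. The helper name is hypothetical and the logging is simplified to stderr. The point is that the status from waitpid() has to be inspected before concluding that a child is gone.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Classify a status returned by waitpid().  Returns 1 if the child has
 * really terminated (exited or was killed by a signal), 0 otherwise,
 * for example when it was merely stopped or continued.
 */
int
child_is_gone(pid_t pid, int status)
{
	if (WIFEXITED(status)) {
		fprintf(stderr, "child %ld exited with status %d\n",
		    (long)pid, WEXITSTATUS(status));
		return 1;
	}
	if (WIFSIGNALED(status)) {
		fprintf(stderr, "child %ld terminated by signal %d\n",
		    (long)pid, WTERMSIG(status));
		return 1;
	}
	return 0;
}
```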


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
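
The read-once, drain-in-a-loop pattern can be illustrated with a toy length-prefixed framing. Everything below (struct rbuf, frame_read(), frame_get(), the 2-byte header) is an assumption made for the sketch and does not reflect the actual imsg wire format or API.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define RBUF_SIZE 4096

struct rbuf {			/* hypothetical read buffer, one per pipe */
	unsigned char	buf[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* One read(2) per poll event; may pull in several messages at once. */
ssize_t
frame_read(int fd, struct rbuf *rb)
{
	ssize_t n;

	n = read(fd, rb->buf + rb->len, sizeof(rb->buf) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return n;
}

/*
 * Extract one message of the form [uint16_t payload length][payload].
 * Works on the buffer only and never calls read(2) itself.
 * Returns 1 if a complete message was copied to out, 0 if more data is
 * needed, -1 if the payload cannot fit in out or in the read buffer.
 */
int
frame_get(struct rbuf *rb, unsigned char *out, size_t outsz, size_t *outlen)
{
	uint16_t plen;

	if (rb->len < sizeof(plen))
		return 0;
	memcpy(&plen, rb->buf, sizeof(plen));
	if (plen > outsz || plen > RBUF_SIZE - sizeof(plen))
		return -1;
	if (rb->len < sizeof(plen) + plen)
		return 0;
	memcpy(out, rb->buf + sizeof(plen), plen);
	*outlen = plen;
	rb->len -= sizeof(plen) + plen;
	memmove(rb->buf, rb->buf + sizeof(plen) + plen, rb->len);
	return 1;
}
```

A caller would invoke frame_read() once when poll() reports the descriptor readable, then call frame_get() repeatedly until it stops returning 1.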


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for a ackmessage from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
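
Why the type matters can be seen in a small write loop. This is a generic sketch with a hypothetical write_all() helper, not code from bgpd: if the result were stored in a size_t, the -1 error return would become a huge positive value and the error check would never fire.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write all len bytes of buf to fd.  Returns 0 on success, -1 on error.
 * n is ssize_t so that the -1 error return stays negative.
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	ssize_t n;

	while (len > 0) {
		n = write(fd, p, len);
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0)
			return -1;	/* no progress; avoid spinning */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```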


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.236 11-May-2021 claudio

Use non-blocking connect() to setup the RTR socket. connect() can hang for
a long time if the IP is not reachable and would block the main process
while doing so.
Problem noticed by Pier Carlo Chiodi
OK benno@


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
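
The pitfall can be shown with a stand-in for addr2sa(). The make_sa() helper below is hypothetical and much simpler than the real function. What matters is that C leaves the evaluation order of function arguments unspecified, so a length filled in by one argument expression must not be read by another argument of the same call.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical stand-in for addr2sa(): builds a sockaddr for an IPv4
 * loopback port in static storage and reports its size through *len.
 */
struct sockaddr *
make_sa(in_port_t port, socklen_t *len)
{
	static struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(sin);
	return (struct sockaddr *)&sin;
}

int
bind_loopback(int fd, in_port_t port)
{
	struct sockaddr *sa;
	socklen_t len = 0;

	/*
	 * Not: bind(fd, make_sa(port, &len), len);
	 * The compiler may read len before make_sa() has stored the
	 * real size, passing 0 to bind().
	 */
	sa = make_sa(port, &len);
	return bind(fd, sa, len);
}
```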


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
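
A rough sketch of the buffer sizing described above. READ_SIZE and
make_ipc_pair() are assumptions made for illustration; the actual read
size and error handling in bgpd may differ.

```c
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* assumed per-read size, illustrative only */

/* Hypothetical helper: create a socketpair with enlarged buffers. */
int
make_ipc_pair(int fds[2])
{
	int bsize = 4 * READ_SIZE;
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		/* grow both directions so reads and writes can fill a buffer */
		if (setsockopt(fds[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(fds[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}
```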


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
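
A small illustration of the idiom above. struct peer_config, send_obj()
and send_peer() are hypothetical names; send_obj() only stands in for an
imsg compose call.

```c
#include <string.h>

struct peer_config {
	unsigned int	id;
	char		descr[32];
};

/* Hypothetical stand-in for an imsg compose call: copies len bytes. */
size_t
send_obj(void *dst, const void *data, size_t len)
{
	memcpy(dst, data, len);
	return len;
}

size_t
send_peer(void *dst, const struct peer_config *pc)
{
	/* sizeof(*pc) follows the pointer's type if it is ever changed */
	return send_obj(dst, pc, sizeof(*pc));
}
```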


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will then be closed unexpectedly by
exec().

ok tedu florian
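
One way to handle the oldd == newd case, sketched under the assumption
that the descriptor must survive exec(). move_fd() is a hypothetical
helper, not the function changed in this revision.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: make fd available as target across exec(). */
int
move_fd(int fd, int target)
{
	int flags;

	if (fd == target) {
		/* dup2() would change nothing here, so clear the flag by hand */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* a descriptor created by dup2() does not have FD_CLOEXEC set */
	if (dup2(fd, target) == -1)
		return -1;
	return 0;
}
```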


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
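
The three ovs states used in the filter rules above can be sketched with
a linear scan over IPv4 ROAs, following the usual origin validation rules
(a covering ROA with matching AS and acceptable length is valid, a
covering ROA without a match is invalid, no covering ROA is not-found).
The names below are hypothetical and the scan stands in for bgpd's real
lookup structure. Prefixes are in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

enum ovs { OVS_NOTFOUND, OVS_INVALID, OVS_VALID };

struct roa {
	uint32_t	prefix;		/* host byte order */
	uint8_t		plen;
	uint8_t		maxlen;		/* equals plen when not configured */
	uint32_t	asn;
};

static uint32_t
mask4(uint8_t plen)
{
	return plen == 0 ? 0 : 0xffffffffU << (32 - plen);
}

enum ovs
validate(const struct roa *roas, size_t n, uint32_t prefix, uint8_t plen,
    uint32_t origin)
{
	enum ovs state = OVS_NOTFOUND;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct roa *r = &roas[i];

		/* skip ROAs that do not cover the announced prefix */
		if (r->plen > plen ||
		    ((prefix ^ r->prefix) & mask4(r->plen)) != 0)
			continue;
		if (plen <= r->maxlen && origin == r->asn)
			return OVS_VALID;
		/* covered, but wrong origin or too specific */
		state = OVS_INVALID;
	}
	return state;
}
```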


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
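
The bsearch-based membership test mentioned above might look roughly like
this. as_cmp() and as_set_match() are hypothetical names, and the array
is assumed to be sorted in ascending order before any lookup.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for 32-bit AS numbers. */
int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Returns 1 if asn is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t asn)
{
	return bsearch(&asn, set, n, sizeof(*set), as_cmp) != NULL;
}
```

The lookup cost grows with the logarithm of the set size instead of
linearly as with a list of filter terms.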


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
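
The EOF behaviour this teardown relies on can be shown in a few lines.
peer_closed() is a hypothetical helper. A real event loop would see the
zero-length read while processing messages rather than through a
dedicated check, and this sketch consumes one byte if data is pending.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 when the other end of the IPC socket
 * has been closed (read() reports EOF), 0 if data arrived, -1 on error.
 */
int
peer_closed(int fd)
{
	char c;
	ssize_t n;

	n = read(fd, &c, 1);
	if (n == -1)
		return -1;
	return n == 0;
}
```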


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
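
A sketch of the replacement idiom: the flags are requested when the
socket is created instead of being set afterwards with fcntl().
open_stream_socket() is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <fcntl.h>

/* Hypothetical helper: non-blocking, close-on-exec stream socket. */
int
open_stream_socket(int af)
{
	/* both flags are applied atomically at creation time */
	return socket(af, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}
```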


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
	nexthop qualify via bgp
	nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
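
A hedged sketch of the waitpid()-and-check-status approach described above. The function name child_died is hypothetical, and the actual handling in bgpd.c may differ:

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap children without blocking. Returns 1 if a child really exited or
 * was killed by a signal, 0 if nothing was pending or the status change
 * was of another kind.
 */
static int
child_died(void)
{
	pid_t	pid;
	int	status, died = 0;

	while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
		if (WIFEXITED(status) || WIFSIGNALED(status))
			died = 1;
	}
	return died;
}
```

The main loop would call this after noticing SIGCHLD and only log and quit when it returns 1.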


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
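
The read-once, drain-in-a-loop pattern described above can be sketched with a toy buffer. This is not the imsg API: struct toy_buf, toy_get and toy_dispatch are hypothetical stand-ins, and the framing (one length byte followed by the payload) is an assumption made for illustration only:

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a per-pipe read buffer; not the real imsg structures. */
struct toy_buf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one message: a 1-byte payload length followed by the payload.
 * Returns 1 and copies the payload to out (at least 255 bytes) if a
 * complete message is buffered, 0 otherwise. Never reads from an fd.
 */
static int
toy_get(struct toy_buf *b, unsigned char *out, size_t *outlen)
{
	size_t need;

	if (b->len < 1)
		return 0;
	need = 1 + (size_t)b->data[0];
	if (b->len < need)
		return 0;		/* partial message, wait for more data */
	*outlen = need - 1;
	memcpy(out, b->data + 1, *outlen);
	memmove(b->data, b->data + need, b->len - need);
	b->len -= need;
	return 1;
}

/* After one read filled the buffer, drain every complete message. */
static int
toy_dispatch(struct toy_buf *b)
{
	unsigned char	msg[255];
	size_t		msglen;
	int		handled = 0;

	while (toy_get(b, msg, &msglen))
		handled++;	/* a real handler would switch on the type */
	return handled;
}
```

Calling the getter only once per poll event would leave the second message sitting in the buffer until more data arrives, which is the queue build-up the commit describes.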


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.235 03-May-2021 claudio

Like in the session engine do not inline the addr2sa call into connect and
bind. The len argument is modified by addr2sa but is also used as argument
in the call and it is undefined if the value of len in connect is set to
the value "returned" by addr2sa().
Should fix connect issues seen on Linux systems.
OK denis@
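
A sketch of the pattern this change describes, assuming a helper that returns the sockaddr and reports its length through a pointer argument. toy_addr2sa and toy_bind are hypothetical stand-ins; the real addr2sa() signature is not reproduced here:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical stand-in for addr2sa(): fills *ss for IPv4 loopback and
 * reports the sockaddr length through *len.
 */
static struct sockaddr *
toy_addr2sa(struct sockaddr_storage *ss, in_port_t port, socklen_t *len)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_port = htons(port);
	sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	*len = sizeof(*sin);
	return (struct sockaddr *)ss;
}

static int
toy_bind(int fd, in_port_t port)
{
	struct sockaddr_storage	 ss;
	struct sockaddr		*sa;
	socklen_t		 len = 0;

	/*
	 * Not: bind(fd, toy_addr2sa(&ss, port, &len), len);
	 * the order in which the two arguments are evaluated is not
	 * fixed by C, so len may be read before it is set.
	 */
	sa = toy_addr2sa(&ss, port, &len);
	return bind(fd, sa, len);
}
```

Splitting the call into two statements places a sequence point between setting len and reading it, so every compiler sees the same value.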


Revision tags: OPENBSD_6_9_BASE
# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
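
A small illustration of why the 'obj, sizeof(*obj)' form is preferred. struct toy_peer, toy_compose and send_peer are hypothetical; toy_compose only mimics the data/length argument pair of a compose call and is not the imsg API:

```c
#include <stddef.h>
#include <string.h>

struct toy_peer {		/* hypothetical object type */
	int	id;
	char	descr[32];
};

/* Toy stand-in for a compose call: copies len bytes of data into buf. */
static size_t
toy_compose(unsigned char *buf, size_t buflen, const void *data, size_t len)
{
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

static size_t
send_peer(unsigned char *buf, size_t buflen, const struct toy_peer *p)
{
	/* sizeof(*p) follows the type of p if that type ever changes. */
	return toy_compose(buf, buflen, p, sizeof(*p));
}
```

Writing sizeof(struct toy_peer) instead would keep compiling after p is changed to point at a different type, and would then silently send the wrong number of bytes.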


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will then be closed unexpectedly by
exec().

ok tedu florian
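
A minimal sketch of one way to handle the oldd == newd case, assuming the descriptor has to survive exec() under a fixed number. move_fd is a hypothetical helper, not a function from bgpd.c:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make fd available as descriptor number target across exec().
 * Returns 0 on success, -1 on error.
 */
static int
move_fd(int fd, int target)
{
	if (fd != target)
		return dup2(fd, target) == -1 ? -1 : 0;
	/* dup2(fd, fd) is a no-op and would leave FD_CLOEXEC set. */
	return fcntl(fd, F_SETFD, 0) == -1 ? -1 : 0;
}
```

dup2() gives the new descriptor a cleared close-on-exec flag, but only when it actually duplicates; when both numbers are equal the flag has to be cleared explicitly.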


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
    165.254.255.0/24 source-as 15562
    193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
    1.2.3.0/24 source-as 1,
    1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
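
A hedged sketch of a bsearch(3)-based membership test over a sorted array of AS numbers. as_cmp and as_set_match are hypothetical names, and the real set_table code may store and compare its elements differently:

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way compare of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* set must be sorted ascending (e.g. with qsort and as_cmp). */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

The array is sorted once when the set is loaded; each filter evaluation then costs O(log n) comparisons instead of walking a list of AS numbers.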


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
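
The EOF behaviour this shutdown scheme relies on can be sketched as follows. peer_closed is a hypothetical helper; in the daemon the check would sit inside the child's event loop:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer end of fd has been closed (read() reports EOF),
 * 0 if data was read, -1 on error. A child's event loop would leave
 * its loop and exit on 1.
 */
static int
peer_closed(int fd)
{
	char	c;
	ssize_t	n;

	n = read(fd, &c, 1);
	if (n == -1)
		return -1;
	return n == 0;
}
```

Because the child notices the closed socket on its own, the parent needs neither the child's PID nor kill(2) to shut it down.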


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and re-add the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
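
A minimal sketch of the pattern, not the bgpd code itself: the pollfd array
is zeroed on every pass before it is filled in, so a failed poll(2) cannot
leave stale revents behind for the dispatch code to act on. PFD_COUNT,
poll_two() and demo_poll_two() are hypothetical names chosen for this
illustration.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define PFD_COUNT	2	/* hypothetical: two pipes to watch */

/* Rebuild the pollfd array from scratch, then poll. */
int
poll_two(int fd_a, int fd_b, int timeout_ms, struct pollfd pfd[PFD_COUNT])
{
	/* Clear everything, including revents from a previous iteration. */
	memset(pfd, 0, sizeof(pfd[0]) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;
	return poll(pfd, PFD_COUNT, timeout_ms);
}

/* Self-check: garbage left in the array does not survive the call. */
int
demo_poll_two(void)
{
	struct pollfd pfd[PFD_COUNT];
	int a[2], b[2], n;

	if (pipe(a) == -1 || pipe(b) == -1)
		return -1;
	memset(pfd, 0xff, sizeof(pfd));		/* simulate stale state */
	if (write(a[1], "x", 1) != 1)
		return -1;
	n = poll_two(a[0], b[0], 0, pfd);
	assert(n == 1);
	assert(pfd[0].revents & POLLIN);
	assert(pfd[1].revents == 0);
	close(a[0]); close(a[1]); close(b[0]); close(b[1]);
	return 0;
}
```

A caller should still check the return value of poll() and skip the revents
handling entirely on -1, which is what revision 1.128 above arranges.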


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
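
The ordering can be sketched as follows. If the flag were cleared after the
work, a signal arriving while the work runs would set the flag and then have
it wiped out, losing the request. Clearing first means such a signal is
picked up on the next pass. The names reconfig_pending, reload_config() and
check_reconfig() are hypothetical and only stand in for the real handler
flags.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag set from the signal handler. */
static volatile sig_atomic_t reconfig_pending;
static int reloads_done;

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work; here it only counts invocations. */
static void
reload_config(void)
{
	reloads_done++;
}

/* Returns 1 if a reload was run, 0 otherwise. */
static int
check_reconfig(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;	/* clear first ... */
		reload_config();	/* ... then do the work */
		return 1;
	}
	return 0;
}

/* Self-check: one signal leads to exactly one reload. */
int
demo_reconfig(void)
{
	int first, second;

	signal(SIGHUP, sighup_handler);
	raise(SIGHUP);
	first = check_reconfig();
	second = check_reconfig();
	assert(first == 1 && second == 0);
	assert(reloads_done == 1);
	return 0;
}
```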


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
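
For readers unfamiliar with descriptor passing, the underlying mechanism is
an SCM_RIGHTS control message sent with sendmsg(2) over a UNIX-domain
socket. The sketch below is a generic illustration, not the imsg/msgbuf
implementation: send_fd(), recv_fd() and demo_fd_passing() are hypothetical
names, and the demo passes a pipe end instead of a bound listening socket to
stay self-contained.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the UNIX socket sock, riding on a single dummy byte. */
int
send_fd(int sock, int fd)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	struct cmsghdr	*cmsg;
	char		 byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cm, 0, sizeof(cm));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd or -1 on error. */
int
recv_fd(int sock)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	struct cmsghdr	*cmsg;
	char		 byte;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	memset(&cm, 0, sizeof(cm));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}

/* Self-check: a pipe read end still works after being passed. */
int
demo_fd_passing(void)
{
	int sv[2], p[2], got;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(p) == -1)
		return -1;
	if (send_fd(sv[0], p[0]) == -1 || (got = recv_fd(sv[1])) == -1)
		return -1;
	if (write(p[1], "x", 1) != 1 || read(got, &c, 1) != 1)
		return -1;
	assert(c == 'x');
	close(got); close(p[0]); close(p[1]); close(sv[0]); close(sv[1]);
	return 0;
}
```

In the scenario the commit describes, the privileged side would call
socket() and bind() first and then hand the resulting descriptor to the
unprivileged process in the same way.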


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
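
A minimal sketch of the check described above, assuming the parent knows the
pid of the child it cares about. SIGCHLD only says that some child changed
state, so the status returned by waitpid(2) has to be inspected before
concluding the child is gone. child_is_gone() and demo_child_status() are
hypothetical names, not functions from bgpd.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Returns 1 if the child exited or was killed by a signal, 0 if there
 * is nothing to report (or it merely stopped), -1 on error.
 * options is passed to waitpid(); WNOHANG is typical after SIGCHLD.
 */
int
child_is_gone(pid_t pid, int options, int *status)
{
	pid_t r;

	do {
		r = waitpid(pid, status, options);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* WNOHANG and no state change yet */
	return (WIFEXITED(*status) || WIFSIGNALED(*status)) ? 1 : 0;
}

/* Self-check: a child calling _exit(3) is reported with status 3. */
int
demo_child_status(void)
{
	int status = 0, gone;
	pid_t pid;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(3);
	gone = child_is_gone(pid, 0, &status);	/* block for the demo */
	assert(gone == 1);
	assert(WIFEXITED(status) && WEXITSTATUS(status) == 3);
	return 0;
}
```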


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
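
The read-once, drain-in-a-loop pattern can be illustrated with a toy framed
buffer. This is not the imsg API: struct toy_hdr, toy_fill(), toy_get() and
toy_drain() are hypothetical stand-ins, with toy_fill() playing the role of
the read step and toy_get() returning at most one message per call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUFSZ	256

struct toy_hdr {
	uint16_t	type;
	uint16_t	len;	/* total length, header included */
};

struct toy_buf {
	unsigned char	data[TOY_BUFSZ];
	size_t		used;
};

/* Stand-in for the read step: append n bytes to the buffer. */
static int
toy_fill(struct toy_buf *b, const void *src, size_t n)
{
	if (n > sizeof(b->data) - b->used)
		return -1;
	memcpy(b->data + b->used, src, n);
	b->used += n;
	return 0;
}

/* Returns 1 if one complete message was removed, 0 if more data is
 * needed, -1 if the header is bogus. Payload is discarded here. */
static int
toy_get(struct toy_buf *b, struct toy_hdr *hdr)
{
	if (b->used < sizeof(*hdr))
		return 0;
	memcpy(hdr, b->data, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > sizeof(b->data))
		return -1;
	if (b->used < hdr->len)
		return 0;
	memmove(b->data, b->data + hdr->len, b->used - hdr->len);
	b->used -= hdr->len;
	return 1;
}

/* After one fill, process every complete message already buffered. */
static int
toy_drain(struct toy_buf *b)
{
	struct toy_hdr hdr;
	int n, count = 0;

	while ((n = toy_get(b, &hdr)) == 1)
		count++;
	return n == -1 ? -1 : count;
}

/* Self-check: three full messages plus a partial one in one buffer. */
int
demo_toy_drain(void)
{
	struct toy_buf b;
	struct toy_hdr h = { 1, sizeof(struct toy_hdr) };
	int i, n;

	memset(&b, 0, sizeof(b));
	for (i = 0; i < 3; i++)
		if (toy_fill(&b, &h, sizeof(h)) == -1)
			return -1;
	if (toy_fill(&b, &h, 2) == -1)	/* half a header */
		return -1;
	n = toy_drain(&b);
	assert(n == 3);
	assert(b.used == 2);	/* partial message waits for more data */
	return 0;
}
```

Calling the get step only once per poll event would leave two of the three
messages sitting in the buffer, which is the queue build-up the commit
message describes.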


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.234 16-Feb-2021 claudio

Add RTR support to OpenBGPD. Add basic support for the protocol.
The RTR client runs in a new process where the protocol handling is done
and when new data is available all sources are merged into one ROA set
which is then loaded into the RDE. The roa-set from the config is also
handled by the new RTR engine.
Tested by and ok job@


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
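
A minimal sketch of the idea, not the code from this commit: create the
socket pair, then ask for larger buffers on both ends. The function name
and SKETCH_READ_SIZE are assumptions made for illustration, and the
setsockopt() calls are treated as best effort.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Assumed read size for illustration; not taken from bgpd. */
#define SKETCH_READ_SIZE	65535

/*
 * Create an AF_UNIX socket pair and try to raise the send and receive
 * buffers of both ends to four times the read size.
 * Returns 0 on success, -1 if the pair could not be created.
 */
int
sketch_socketpair(int pair[2])
{
	int i, bsize = 4 * SKETCH_READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;

	for (i = 0; i < 2; i++) {
		/* Best effort: keep the kernel default if a request is refused. */
		(void)setsockopt(pair[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize));
		(void)setsockopt(pair[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize));
	}
	return 0;
}
```

A real daemon would probably log or retry with a smaller size when the
kernel refuses the request instead of ignoring the result.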


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

When passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
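
The difference can be illustrated with a small sketch. struct sketch_obj
and sketch_send() are hypothetical stand-ins for a real imsg payload and
the imsg compose call; only the sizeof form is the point here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object and transport; stand-ins for real imsg payloads. */
struct sketch_obj {
	int	id;
	char	name[16];
};

static unsigned char	sketch_buf[64];

/* Copy len bytes into a buffer and report how many were "sent". */
static size_t
sketch_send(const void *data, size_t len)
{
	if (len > sizeof(sketch_buf))
		return 0;
	memcpy(sketch_buf, data, len);
	return len;
}

size_t
sketch_send_obj(const struct sketch_obj *obj)
{
	/* sizeof(*obj) follows the pointer's type if that type ever changes. */
	return sketch_send(obj, sizeof(*obj));
}
```

With sizeof(struct object) the size and the pointer can silently drift
apart when the pointer's type is changed later.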


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling and, while doing so, move the pfkey
socket to the parent process. The keys are refreshed whenever the session
state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
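
A sketch of the pattern, assuming a hypothetical helper name: dup2()
with equal arguments does nothing, so the close-on-exec flag has to be
cleared by hand in that case.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make descriptor oldd available as newd so that
 * it survives exec(). Returns 0 on success, -1 on error.
 */
int
sketch_fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would do nothing here, so clear FD_CLOEXEC by hand. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* The duplicate created by dup2() never has FD_CLOEXEC set. */
	if (dup2(oldd, newd) == -1)
		return -1;
	return 0;
}
```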


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd;
this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain), the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
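
A sketch of a bsearch-based membership test, assuming a plain sorted
array of 32-bit AS numbers. The function names are hypothetical and the
table layout in bgpd may well differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two AS numbers for qsort() and bsearch(). */
static int
sketch_as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the table once so that later lookups can use bsearch(). */
void
sketch_as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), sketch_as_cmp);
}

/* Return 1 if as is in the sorted table, 0 otherwise. */
int
sketch_as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), sketch_as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, where walking a list of AS
numbers costs O(n).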


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
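
The child side of such a pipe teardown might look like the sketch below.
sketch_child_loop() is a hypothetical name and the loop only counts
bytes. The point is that read() returning 0 is the shutdown signal, so
no kill(2) is needed.

```c
#include <sys/types.h>

#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical child loop: consume data from the IPC descriptor until
 * the peer closes its end. Returns the number of bytes seen, or -1 on
 * a read error.
 */
ssize_t
sketch_child_loop(int fd)
{
	char	buf[256];
	ssize_t	n, total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)	/* EOF: the parent closed the pipe */
			break;
		total += n;
	}
	return total;
}
```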


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
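
The caller-side check can be sketched with plain write(2) instead of
msgbuf_write(). Both function names below are hypothetical. The wrapper
maps EAGAIN to "poll again later" and leaves -1 for real errors.

```c
#include <sys/types.h>

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int
sketch_set_nonblock(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical wrapper around write(2) on a non-blocking descriptor.
 * Returns the number of bytes written, 0 if the caller should poll
 * again later, or -1 on a real error.
 */
ssize_t
sketch_try_write(int fd, const void *buf, size_t len)
{
	ssize_t n;

	for (;;) {
		n = write(fd, buf, len);
		if (n != -1)
			return n;
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN)
			return 0;	/* not an error: re-add POLLOUT */
		return -1;
	}
}
```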


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
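
A sketch of the defensive pattern, with a hypothetical helper name: zero
the pollfd array before every poll() call and do not look at revents
when poll() reports an error or a timeout.

```c
#include <sys/types.h>

#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable
 * and read from it. Returns bytes read, 0 on timeout or EOF (a real
 * caller would distinguish those two), or -1 on error.
 */
ssize_t
sketch_poll_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd	pfd[1];
	int		nfds;

	/* Start from a clean array so no stale revents can survive. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return -1;	/* includes EINTR: do not look at revents */
	if (nfds == 0)
		return 0;	/* timeout */

	if (pfd[0].revents & (POLLIN | POLLHUP))
		return read(fd, buf, len);
	return -1;
}
```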


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically allocated, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
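
The fixed ordering can be sketched as follows. All names are
hypothetical. The handler only sets a flag, and the main loop clears the
flag before doing the work.

```c
#include <signal.h>

static volatile sig_atomic_t	sketch_reconfig;
static int			sketch_reloads;

/* Signal handler: only record that work is pending. */
void
sketch_sighdlr(int sig)
{
	if (sig == SIGHUP)
		sketch_reconfig = 1;
}

/* Stand-in for the real work, e.g. rereading a config file. */
static void
sketch_dowork(void)
{
	sketch_reloads++;
}

/* Called from the main loop after poll() returns. */
void
sketch_check_signals(void)
{
	if (sketch_reconfig) {
		/*
		 * Clear the flag first. A signal that arrives while
		 * sketch_dowork() runs sets it again and is handled on
		 * the next pass instead of being lost.
		 */
		sketch_reconfig = 0;
		sketch_dowork();
	}
}
```

With the old order a signal delivered during the work would be wiped out
by the assignment that follows it.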


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
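
The underlying mechanism is SCM_RIGHTS on an AF_UNIX socket. The sketch
below shows it with bare sendmsg()/recvmsg() and hypothetical function
names. It is not the imsg/msgbuf code, which wraps the same mechanism
in its own message framing.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <string.h>
#include <unistd.h>

/* Properly aligned space for one descriptor in a control message. */
union sketch_cmsgbuf {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the AF_UNIX socket sock. Returns 0 or -1. */
int
sketch_send_fd(int sock, int fd)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union sketch_cmsgbuf	 cmsgbuf;
	char			 c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;		/* one byte of regular data */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor from sock. Returns the new fd or -1. */
int
sketch_recv_fd(int sock)
{
	struct msghdr		 msg;
	struct cmsghdr		*cmsg;
	struct iovec		 iov;
	union sketch_cmsgbuf	 cmsgbuf;
	char			 c;
	int			 fd;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the setting described above, the privileged process would call
socket() and bind() and then hand the resulting descriptor to the
unprivileged one in this way.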


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
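
A minimal sketch of the check described above, assuming a non-blocking waitpid() on a known child pid. child_gone() is a hypothetical name and is not taken from bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Reap pid without blocking. Returns 1 if the child exited or was killed
 * by a signal, 0 if there is nothing fatal to report (still running, or
 * only stopped/continued), and -1 on waitpid() error.
 */
int
child_gone(pid_t pid)
{
	int	status;
	pid_t	r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* no state change for this child */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return (1);	/* this is the case worth logging and quitting on */
	return (0);
}
```

The point is that a SIGCHLD only says that some child changed state; the status returned by waitpid() decides whether that change was a termination.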


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
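
The read-once, get-in-a-loop pattern can be sketched as follows. This is not the imsg API: struct rbuf, frame_get() and frame_drain() are hypothetical names, and the two-byte length header is an assumption made to keep the example small.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define HDR_SIZE 2	/* assumed header: 16-bit total length, host byte order */

struct rbuf {
	unsigned char	data[4096];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Copy the next complete message (header included) into out.
 * Returns its length, 0 if no complete message is buffered yet,
 * or -1 if the header is invalid or out is too small.
 * Never calls read(2); it only works on what is already buffered.
 */
ssize_t
frame_get(struct rbuf *rb, unsigned char *out, size_t outlen)
{
	uint16_t	mlen;

	if (rb->len < HDR_SIZE)
		return (0);
	memcpy(&mlen, rb->data, sizeof(mlen));
	if (mlen < HDR_SIZE || mlen > sizeof(rb->data) || mlen > outlen)
		return (-1);
	if (rb->len < mlen)
		return (0);
	memcpy(out, rb->data, mlen);
	memmove(rb->data, rb->data + mlen, rb->len - mlen);
	rb->len -= mlen;
	return (mlen);
}

/* Drain every buffered message. Returns the number handled, -1 on error. */
int
frame_drain(struct rbuf *rb)
{
	unsigned char	msg[256];
	ssize_t		n;
	int		count = 0;

	while ((n = frame_get(rb, msg, sizeof(msg))) > 0)
		count++;	/* a real caller would dispatch on msg here */
	return (n == -1 ? -1 : count);
}
```

A caller would fill the buffer once per poll event and then call frame_drain(), so that a second message arriving in the same read is not left waiting for the next event.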


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.233 04-Jan-2021 claudio

Rename PFD_PIPE_ROUTE to PFD_PIPE_RDE which is a more obvious name.
Also change the startup code to use enum bgpd_process to select which
process needs to be run. Makes the code in my opinion easier to understand.
OK denis@


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
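
As a rough sketch of the idea, the buffer sizes can be requested with setsockopt() right after socketpair(). big_socketpair() and its readsz parameter are hypothetical; the socket type, flags and read size bgpd really uses are not shown in this log.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UNIX-domain socketpair and ask for send and receive buffers
 * of 4 * readsz bytes on both ends. Returns 0 on success, -1 on error.
 */
int
big_socketpair(int sv[2], int readsz)
{
	int	i, bsize = 4 * readsz;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

The kernel may round or cap the requested size, so a caller that cares about the effective value should read it back with getsockopt().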


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps the CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
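
One way to handle the oldd == newd case is sketched below, assuming the goal is a descriptor that survives exec(). fd_for_exec() is a hypothetical helper and may differ from what the commit actually does.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make descriptor oldd available as newd so that it stays open across
 * exec(). Returns 0 on success, -1 on error.
 */
int
fd_for_exec(int oldd, int newd)
{
	int	flags;

	/* dup2() to a different number yields a copy without FD_CLOEXEC. */
	if (oldd != newd)
		return (dup2(oldd, newd) == -1 ? -1 : 0);

	/* dup2() would change nothing here, so clear close-on-exec by hand. */
	if ((flags = fcntl(oldd, F_GETFD)) == -1)
		return (-1);
	return (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0);
}
```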


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
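
A small sketch of a bsearch-based membership test over a sorted array of AS numbers. as_set_prepare() and as_set_match() are hypothetical names, and the flat uint32_t array is an assumption rather than bgpd's actual set representation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort the set once so that later lookups can use binary search. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup costs O(log n) comparisons, which is where the gain over walking a long list of AS numbers in a filter rule comes from.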


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup, issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
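
A minimal sketch of the idea, not the actual bgpd code: the array is zeroed
and rebuilt before every poll() call, so stale revents cannot be read back
after a failed call. setup_pollfds() and its arguments are hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: rebuild the pollfd array from scratch before
 * every poll() call. Zeroing first means no revents from a previous
 * iteration can survive if poll() fails (e.g. with EINTR) and leaves
 * the array untouched.
 */
static void
setup_pollfds(struct pollfd *pfd, const int *fds, size_t n)
{
	size_t i;

	assert(pfd != NULL && fds != NULL);
	memset(pfd, 0, n * sizeof(*pfd));
	for (i = 0; i < n; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
}
```

The caller would invoke setup_pollfds() at the top of each loop iteration and
only look at revents when poll() returned a value greater than zero.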


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense this way.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear the flag,
then call dowork(). ok henning@
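
A small sketch of the corrected ordering, under assumed names (reload_flag,
do_reload(), check_reload() and reload_example() are hypothetical and only
stand in for the real handlers): the flag is cleared before the work runs, so
a signal arriving during the work sets it again instead of being lost.

```c
#include <assert.h>
#include <signal.h>

static volatile sig_atomic_t reload_flag;	/* set by the handler */
static int reloads_done;			/* hypothetical work counter */

static void
sighup_handler(int sig)
{
	(void)sig;
	reload_flag = 1;
}

/* Hypothetical stand-in for the real work, e.g. rereading the config. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal delivered while
 * do_reload() runs sets the flag again and is picked up on the next
 * pass; with the old order the late "flag = 0" would have erased it.
 */
static int
check_reload(void)
{
	if (!reload_flag)
		return 0;
	reload_flag = 0;
	do_reload();
	return 1;
}

/* Usage: one pending signal results in exactly one reload. */
static void
reload_example(void)
{
	sighup_handler(SIGHUP);		/* simulate signal delivery */
	check_reload();
	check_reload();
	assert(reloads_done == 1);
}
```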


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
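
A hedged sketch of the check described above, not the code from this commit:
the wait status is inspected and the child counts as gone only if it exited
or was killed by a signal. spawn_child() and child_terminated() are
hypothetical helpers; WUNTRACED is used here only so that a merely stopped
child would be reported and told apart.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical stand-in for starting a child such as the RDE or SE. */
static pid_t
spawn_child(int code)
{
	pid_t pid;

	pid = fork();
	if (pid == 0)
		_exit(code);
	return pid;		/* -1 on fork failure */
}

/*
 * Returns 1 if the child exited or was terminated by a signal,
 * 0 if waitpid() reported some other state change (e.g. stopped),
 * and -1 on error.
 */
static int
child_terminated(pid_t pid)
{
	int status;
	pid_t r;

	assert(pid > 0);
	do {
		r = waitpid(pid, &status, WUNTRACED);
	} while (r == -1 && errno == EINTR);
	if (r == -1)
		return -1;
	return (WIFEXITED(status) || WIFSIGNALED(status)) ? 1 : 0;
}
```

A parent that gets 1 back would log the event and shut down; a result of 0
means the SIGCHLD was caused by something other than the child dying.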


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
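
The read-once, get-in-a-loop pattern can be sketched as follows. This is an
illustration only and does not use the real imsg API: struct rbuf, msg_get(),
msg_drain() and the 2-byte length header are assumptions made up for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN	2	/* hypothetical header: 16-bit total length */

/* Hypothetical read buffer, filled elsewhere by a single read(2). */
struct rbuf {
	unsigned char	data[256];
	size_t		len;
};

/*
 * Pull exactly one message out of the buffer without reading from
 * the socket. Returns the message length, 0 if no complete message
 * is buffered, -1 on a bad header.
 */
static int
msg_get(struct rbuf *rb, unsigned char *out, size_t outlen)
{
	uint16_t mlen;

	assert(rb != NULL && out != NULL);
	if (rb->len < HDR_LEN)
		return 0;
	memcpy(&mlen, rb->data, HDR_LEN);
	if (mlen < HDR_LEN || mlen > outlen)
		return -1;
	if (rb->len < mlen)
		return 0;
	memcpy(out, rb->data, mlen);
	memmove(rb->data, rb->data + mlen, rb->len - mlen);
	rb->len -= mlen;
	return mlen;
}

/* Process every buffered message; returns how many were handled. */
static int
msg_drain(struct rbuf *rb)
{
	unsigned char msg[64];
	int n, count = 0;

	while ((n = msg_get(rb, msg, sizeof(msg))) > 0)
		count++;
	return n == -1 ? -1 : count;
}
```

Calling msg_get() only once per poll event would leave the second message
sitting in the buffer, which is the queue build-up the commit describes.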


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.232 30-Dec-2020 claudio

RB_REMOVE from the correct tree. Dumb copy paste bug introduced by last commit.
Noticed by procter@


# 1.231 29-Dec-2020 claudio

In preparation for RTR support change the representation of the roa-set
in the parent to a simple RB tree based on struct roa. With this overlapping
ROAs (same prefix & source-as but different maxlen) are now merged in the RDE
when the lookup trie is constructed.
OK benno@


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
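
An illustrative sketch of the idiom with made-up names (struct peer_conf,
compose() and send_peer() are hypothetical, and compose() is not the imsg
API): taking the size from the pointer keeps the length correct even if the
pointer's type is changed later.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object that gets passed between processes. */
struct peer_conf {
	unsigned int	id;
	char		descr[32];
};

/*
 * Hypothetical stand-in for a compose call: copies len bytes into
 * buf and returns the number of bytes queued, 0 if it does not fit.
 */
static size_t
compose(unsigned char *buf, size_t buflen, const void *data, size_t len)
{
	assert(buf != NULL && data != NULL);
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

static size_t
send_peer(unsigned char *buf, size_t buflen, const struct peer_conf *p)
{
	/* sizeof(*p) follows the type of p; no struct name to keep in sync */
	return compose(buf, buflen, p, sizeof(*p));
}
```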


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
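
A sketch of the pitfall described above, under the assumption that the
descriptor has to survive exec(). move_fd() is a hypothetical helper, not a
function from bgpd:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make oldd available as newd so that it survives exec().
 * Returns 0 on success, -1 on failure.
 */
int
move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/*
		 * dup2() with equal descriptors does nothing, so a
		 * FD_CLOEXEC flag would stay set.  Clear it by hand.
		 */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* A descriptor created by dup2() never has FD_CLOEXEC set. */
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```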


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs, so better call it before doing the
set_pollfd. This ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
    165.254.255.0/24 source-as 15562
    193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
    1.2.3.0/24 source-as 1,
    1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
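
The lookup described above can be sketched with qsort(3) and bsearch(3).
This is an illustration only; as_cmp(), as_set_prepare() and as_set_match()
are hypothetical names and do not reflect bgpd's actual as-set code:

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers for qsort() and bsearch(). */
int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort the table once so that later lookups can use binary search. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if the AS number is in the sorted table, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Sorting happens once per config load, while each lookup costs only a
logarithmic number of comparisons instead of a walk over a long list.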


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
    descr "CUSTOMER1"
    rd 65003:1
    import-target rt 65003:3
    export-target rt 65003:1
    depend on mpe0
    network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
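
The failure mode above can be avoided with a pattern like the following
sketch. poll_once() is a hypothetical single-descriptor helper, not the
parent's actual poll loop:

```c
#include <poll.h>
#include <string.h>

/*
 * Poll one descriptor for input.  Returns 1 if it is readable,
 * 0 on timeout or error.
 */
int
poll_once(int fd, int timeout_ms)
{
	struct pollfd pfd[1];

	/* Start from a clean array so no stale revents can survive. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	/* On error (e.g. EINTR) or timeout, revents must not be used. */
	if (poll(pfd, 1, timeout_ms) <= 0)
		return (0);
	return ((pfd[0].revents & POLLIN) != 0);
}
```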


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is
automatically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
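
The ordering described above, as a sketch. The names reconfig, sighdlr(),
do_work() and check_flag() are made up for the illustration:

```c
#include <signal.h>

volatile sig_atomic_t reconfig;	/* set by the signal handler */
int work_done;			/* counts calls to do_work() */

void
sighdlr(int sig)
{
	(void)sig;
	reconfig = 1;
}

void
do_work(void)
{
	work_done++;
}

void
check_flag(void)
{
	if (reconfig) {
		/*
		 * Clear the flag first.  A signal that arrives while
		 * do_work() runs sets it again and is handled on the
		 * next pass instead of being lost.
		 */
		reconfig = 0;
		do_work();
	}
}
```

With the opposite order, a signal delivered during do_work() would be wiped
out by the assignment that follows it.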


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
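
A minimal sketch of the waitpid() check described above, assuming a hypothetical helper named child_is_gone() (it is not taken from bgpd.c). It only reports a child as gone when the status says it exited or was killed by a signal, so a child that was merely stopped does not count:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>	/* fork(2)/_exit(2) for callers that spawn the child */

/*
 * Hypothetical helper, not the bgpd.c code.
 * Returns 1 if the child exited or was terminated by a signal,
 * 0 if nothing final happened (e.g. WNOHANG with no state change,
 * or a stopped child reported via WUNTRACED), -1 if waitpid() failed.
 */
int
child_is_gone(pid_t pid, int options)
{
	int	status;
	pid_t	r;

	/* retry if a signal interrupted the call */
	do {
		r = waitpid(pid, &status, options);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return (-1);
	if (r == 0)		/* WNOHANG and no state change yet */
		return (0);
	return (WIFEXITED(status) || WIFSIGNALED(status));
}
```

A signal handler would typically only set a flag on SIGCHLD; the main loop can then call a helper like this and log and quit only when it returns 1.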


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
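
The read/get split described above can be sketched roughly as follows. This is an illustration only, not the imsg code: the names rbuf, rbuf_fill() and rbuf_get() and the 2-byte host-order length header are assumptions made for the example. One fill may buffer several messages, so the caller has to drain them in a loop:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	sizeof(uint16_t)

/* hypothetical read buffer, one per pipe */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* Append raw bytes, as a read(2) wrapper would after a poll event. */
int
rbuf_fill(struct rbuf *b, const void *src, size_t n)
{
	if (n > RBUF_SIZE - b->len)
		return (-1);
	memcpy(b->data + b->len, src, n);
	b->len += n;
	return (0);
}

/*
 * Pop exactly one message without touching the fd.
 * Returns 1 and sets *plen if a payload was copied to out,
 * 0 if more data is needed, -1 on a malformed or oversized message.
 */
int
rbuf_get(struct rbuf *b, void *out, size_t outsz, size_t *plen)
{
	uint16_t	total;	/* header + payload length */

	if (b->len < HDR_SIZE)
		return (0);
	memcpy(&total, b->data, HDR_SIZE);
	if (total < HDR_SIZE || total > RBUF_SIZE ||
	    total - HDR_SIZE > outsz)
		return (-1);
	if (b->len < total)
		return (0);
	*plen = total - HDR_SIZE;
	memcpy(out, b->data + HDR_SIZE, *plen);
	/* shift any following messages to the front of the buffer */
	memmove(b->data, b->data + total, b->len - total);
	b->len -= total;
	return (1);
}
```

On each poll event a caller would fill once and then call rbuf_get() until it stops returning 1, so no message waits for the next poll event.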


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.230 05-Nov-2020 claudio

Kill bgpd_process, nothing uses it anymore. Instead pass the process
type directly to log init. One less common in bgpd.
OK benno@


Revision tags: OPENBSD_6_8_BASE
# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps the CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
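
A minimal sketch of the idea, assuming a hypothetical helper fd_to_slot() (not the function used in bgpd.c). dup2() with identical arguments leaves the descriptor untouched, including its close-on-exec flag, so that case has to clear the flag explicitly:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the bgpd.c code.
 * Make descriptor oldd available as newd across exec().
 * Returns 0 on success, -1 on failure.
 */
int
fd_to_slot(int oldd, int newd)
{
	int	flags;

	if (oldd == newd) {
		/* dup2() would be a no-op: clear FD_CLOEXEC by hand */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* the duplicate created by dup2() does not inherit FD_CLOEXEC */
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```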


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
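
As a rough illustration of a bsearch-based membership test, here is a minimal sketch. The names as_cmp(), as_set_sort() and as_set_contains() are made up for this example and are not the bgpd set_table API; the only assumption is a plain array of 32-bit AS numbers that is sorted once and then searched:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* three-way comparison of two AS numbers for qsort(3)/bsearch(3) */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* sort the table once, e.g. after the config has been parsed */
void
as_set_sort(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* returns 1 if as is in the sorted table, 0 otherwise */
int
as_set_contains(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup costs O(log n) comparisons instead of walking a long list of AS numbers per filter rule.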


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
    nexthop qualify via bgp
    nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively when debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
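
The fix above, together with the rule from revision 1.128 (do not touch the
poll fds after an error or timeout), can be sketched as follows. This is a
minimal illustration, not the bgpd code: wait_readable(), PFD_COUNT and the
single-entry array are hypothetical names chosen for the example.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 1

/* Reused across loop iterations, as a long-lived poll array would be. */
static struct pollfd pfd[PFD_COUNT];

/* Returns 1 if fd is readable, 0 on timeout, -1 on poll() error. */
static int
wait_readable(int fd, int timeout_ms)
{
	int n;

	/* Wipe stale revents left over from the previous iteration. */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, PFD_COUNT, timeout_ms);
	if (n == -1)
		return -1;	/* e.g. EINTR: revents was not written */
	if (n == 0)
		return 0;	/* timeout: nothing to inspect */
	return (pfd[0].revents & POLLIN) ? 1 : 0;
}
```

Clearing the array costs little and means a failed poll() can never be
mistaken for a readable descriptor.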


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
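
A minimal sketch of the corrected ordering, assuming a flag set from a signal
handler. The names reconfig_pending, on_sighup(), do_work() and
handle_pending() are hypothetical and only illustrate the pattern.

```c
#include <signal.h>

/* Set asynchronously by the signal handler. */
static volatile sig_atomic_t reconfig_pending;
static int work_runs;

static void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_work(void)
{
	work_runs++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_work() runs sets the flag again and is picked up on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
static void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_work();
	}
}
```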


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
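
The status check described above can be sketched like this. It is a minimal
illustration using the standard wait(2) macros; child_is_gone() is a
hypothetical helper name, not a function from bgpd.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * SIGCHLD only says that some child changed state. Inspect the status
 * returned by waitpid() and treat the child as gone only if it exited
 * or was terminated by a signal.
 */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}
```

A caller would typically run waitpid() after the SIGCHLD flag is seen and
only log and quit when child_is_gone() reports true for the returned status.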


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
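
The read-once, drain-in-a-loop pattern can be sketched with a toy framing
format. This does not use the imsg API: struct rbuf, rbuf_fill(), msg_get()
and drain() are hypothetical, and the one-byte length prefix is an assumption
made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read buffer; one read may deposit several messages here. */
struct rbuf {
	unsigned char	data[256];
	size_t		len;
};

/* Append bytes the way a single read on the pipe would. */
static int
rbuf_fill(struct rbuf *rb, const void *p, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return -1;
	memcpy(rb->data + rb->len, p, n);
	rb->len += n;
	return 0;
}

/*
 * Extract exactly one message (1-byte length, then payload) from the
 * buffer. Returns the payload length, or -1 if no complete message is
 * buffered. It never reads from a descriptor itself.
 */
static int
msg_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	uint8_t plen;

	if (rb->len < 1)
		return -1;
	plen = rb->data[0];
	if (rb->len < 1u + plen || plen > outsz)
		return -1;
	memcpy(out, rb->data + 1, plen);
	memmove(rb->data, rb->data + 1 + plen, rb->len - 1 - plen);
	rb->len -= 1u + plen;
	return plen;
}

/* Process every complete message; returns how many were handled. */
static int
drain(struct rbuf *rb)
{
	unsigned char payload[255];
	int n = 0;

	while (msg_get(rb, payload, sizeof(payload)) != -1)
		n++;
	return n;
}
```

Calling the getter only once per poll event would leave the second message
sitting in the buffer until more data arrived, which is the queue build-up
the commit describes.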


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.229 11-May-2020 claudio

There is no reason to limit the string length of log_reason() to REASON_LEN
characters. Also fix a long line.
OK benno@ deraadt@


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
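
A minimal sketch of the idea, assuming a read size of 65535 bytes. READ_SIZE,
SOCK_BUF and make_pair() are hypothetical names, and the buffer requests are
treated as best effort since the kernel may clamp them.

```c
#include <sys/types.h>
#include <sys/socket.h>

#define READ_SIZE	65535		/* assumed read size, for illustration */
#define SOCK_BUF	(4 * READ_SIZE)

/* Create a socketpair and ask for larger send and receive buffers. */
static int
make_pair(int pair[2])
{
	int i, len = SOCK_BUF;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		/* Best effort: a refused or clamped size is not fatal here. */
		(void)setsockopt(pair[i], SOL_SOCKET, SO_RCVBUF,
		    &len, sizeof(len));
		(void)setsockopt(pair[i], SOL_SOCKET, SO_SNDBUF,
		    &len, sizeof(len));
	}
	return 0;
}
```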


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
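
The convention can be illustrated without the imsg API. In this sketch
struct peer_config, send_blob() and send_peer() are hypothetical stand-ins
for a real object and a real compose call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object that is shipped to another process. */
struct peer_config {
	int	id;
	char	descr[32];
};

static unsigned char wire[64];

/* Stand-in for a compose call that takes a pointer and a length. */
static size_t
send_blob(const void *data, size_t len)
{
	if (len > sizeof(wire))
		return 0;
	memcpy(wire, data, len);
	return len;
}

static size_t
send_peer(const struct peer_config *pc)
{
	/*
	 * sizeof(*pc) follows the pointer's type, so the length stays
	 * correct if the type of pc is ever changed.
	 */
	return send_blob(pc, sizeof(*pc));
}
```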


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
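
The special case can be sketched as below. This is an illustration under
assumptions rather than the committed code: move_fd() is a hypothetical
helper, and clearing FD_CLOEXEC with fcntl() is one way to handle the case
where the descriptor already has the wanted number.

```c
#include <fcntl.h>
#include <unistd.h>

/* Make fd available as descriptor number 'target' across exec(). */
static int
move_fd(int fd, int target)
{
	int flags;

	if (fd == target) {
		/*
		 * dup2(fd, fd) does nothing, so FD_CLOEXEC would stay set
		 * and exec() would close the descriptor. Clear it by hand.
		 */
		flags = fcntl(fd, F_GETFD);
		if (flags == -1)
			return -1;
		return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
	}
	/* A descriptor created by dup2() does not carry FD_CLOEXEC. */
	if (dup2(fd, target) == -1)
		return -1;
	return 0;
}
```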


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
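
The lookup described above can be sketched as follows. This is a minimal
illustration, not bgpd's actual as_set code: struct as_table and the function
names are hypothetical, and only qsort(3) and bsearch(3) are real interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container; bgpd's real as_set layout is not shown here. */
struct as_table {
	uint32_t	*as;	/* AS numbers, sorted ascending after prepare */
	size_t		 n;	/* number of entries */
};

/* Three-way compare that avoids the overflow of a plain subtraction. */
int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once after loading so that lookups can use binary search. */
void
as_table_prepare(struct as_table *t)
{
	assert(t != NULL);
	if (t->n > 1)
		qsort(t->as, t->n, sizeof(t->as[0]), as_cmp);
}

/* Return 1 if asnum is in the table, 0 otherwise. */
int
as_table_match(const struct as_table *t, uint32_t asnum)
{
	assert(t != NULL);
	if (t->n == 0)
		return 0;
	return bsearch(&asnum, t->as, t->n, sizeof(t->as[0]), as_cmp) != NULL;
}
```

A filter that previously walked a long list of AS numbers would call
as_table_match() once per check, which costs O(log n) comparisons.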


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
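
A minimal sketch of the child side of such a teardown, assuming a plain
read(2) loop rather than bgpd's real imsg-based event loop. The name
child_loop is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>	/* socketpair(), used by the caller */
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Child-side event loop reduced to its core: consume data from the IPC
 * socket until read() returns 0, which means the parent closed its end.
 * Returns the number of bytes seen before EOF, or -1 on a read error.
 */
ssize_t
child_loop(int fd)
{
	char buf[256];
	ssize_t n, total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0)
			break;		/* EOF: parent is shutting down */
		total += n;
	}
	return total;
}
```

The parent only has to close(2) its end of the socketpair; no kill(2) and
no bookkeeping of child PIDs is needed for the shutdown itself.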


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
    nexthop qualify via bgp
    nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
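
A minimal sketch of the pattern, assuming a fixed two-slot array and using
memset(3) in place of bzero. PFD_COUNT and poll_children are hypothetical
names, not taken from bgpd.c.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical: one slot per child pipe */

/*
 * Poll the given descriptors for input. The pollfd array is zeroed on every
 * call so that no revents value from a previous iteration survives when
 * poll() fails (for example with EINTR) and leaves the array untouched.
 * Returns poll()'s result; on error or timeout the caller should not act
 * on revents at all.
 */
int
poll_children(struct pollfd pfd[PFD_COUNT], int fd0, int fd1, int timeout_ms)
{
	memset(pfd, 0, sizeof(pfd[0]) * PFD_COUNT);
	pfd[0].fd = fd0;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd1;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}
```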


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
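
The ordering can be sketched as below. The names reconfig_pending,
do_reload and handle_pending are hypothetical stand-ins, and the reload
itself is reduced to a counter.

```c
#include <assert.h>
#include <signal.h>

/* Set from the signal handler, polled from the main loop. */
volatile sig_atomic_t reconfig_pending;	/* hypothetical name */
int reload_count;			/* stand-in for the real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_reload(void)
{
	reload_count++;
}

/*
 * Clear the flag before doing the work. A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next call.
 * Clearing after the work would wipe out that second request.
 */
void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```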


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
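
A minimal sketch of that check, assuming a non-blocking waitpid(2) on any
child. check_child is a hypothetical name, and the logging and shutdown that
bgpd performs are left to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>	/* fork(), used by the caller */

/*
 * Reap a child without blocking. Returns 1 and stores the pid in *gone if a
 * child exited or was killed by a signal (the caller should log and quit).
 * Returns 0 if no child changed state, if there are no children, or if the
 * state change was something else such as a stop.
 */
int
check_child(pid_t *gone)
{
	pid_t pid;
	int status;

	do {
		pid = waitpid(-1, &status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid <= 0)
		return 0;
	if (WIFEXITED(status) || WIFSIGNALED(status)) {
		*gone = pid;
		return 1;
	}
	return 0;
}
```

Called from the main loop after the SIGCHLD flag was seen, the returned pid
can be compared against the stored RDE and session engine pids to tell which
child went away.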


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.228 10-May-2020 deraadt

In bgpctl argument parser, re-arrange 'reason' parsing ('nei action [reason]')
to be more generic, then change 'reload' to take a '[reason]' also,
which will be logged by bgpd.
ok kn claudio


Revision tags: OPENBSD_6_6_BASE OPENBSD_6_7_BASE
# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue an IMSG_PFKEY_RELOAD call after
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
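
A minimal sketch of the buffer sizing described above, under stated assumptions: READ_SIZE is a hypothetical stand-in for the daemon's read size, make_pair() is an illustrative helper rather than the bgpd function, and the kernel may clamp or adjust the requested sizes.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535			/* assumed read size */
#define SOCK_BUF_SIZE	(4 * READ_SIZE)		/* 4 times the read size */

/* Create a socketpair and ask for larger send and receive buffers. */
static int
make_pair(int sv[2])
{
	int	i, bsize = SOCK_BUF_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return (-1);
		}
	}
	return (0);
}
```

With buffers several times the read size, a single poll() wakeup is more likely to find a full read's worth of data waiting, or room for a full write.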


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The keys are refreshed whenever the session state
changes to IDLE or ACTIVE. This should behave better when reloading configs
with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since both
parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also, to make it more flexible (e.g. having
more than one mpeX interface per rdomain), the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
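
How a prefix and its origin AS could be classified against such a roa-set can be sketched as below. This is a simplified IPv4-only linear scan using the usual valid / invalid / not-found classification. It is not the lookup structure bgpd uses, and struct roa, covers() and roa_check() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* build a host-order IPv4 address from dotted components */
#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	((uint32_t)(c) << 8) | (uint32_t)(d))

enum ovs { OVS_NOTFOUND, OVS_INVALID, OVS_VALID };

struct roa {
	uint32_t	addr;		/* host byte order */
	uint8_t		plen;		/* prefix length of the entry */
	uint8_t		maxlen;		/* longest announcement allowed */
	uint32_t	asnum;		/* source-as */
};

/* Does the roa entry cover addr/plen? */
static int
covers(const struct roa *r, uint32_t addr, uint8_t plen)
{
	uint32_t	mask;

	if (plen < r->plen)
		return (0);
	mask = r->plen == 0 ? 0 : 0xffffffffU << (32 - r->plen);
	return ((addr & mask) == (r->addr & mask));
}

static enum ovs
roa_check(const struct roa *set, size_t n, uint32_t addr, uint8_t plen,
    uint32_t origin)
{
	enum ovs	state = OVS_NOTFOUND;
	size_t		i;

	for (i = 0; i < n; i++) {
		if (!covers(&set[i], addr, plen))
			continue;
		if (plen <= set[i].maxlen && origin == set[i].asnum)
			return (OVS_VALID);
		/* covered, but wrong AS or too specific */
		state = OVS_INVALID;
	}
	return (state);
}
```

With the two entries from the example config, 193.0.4.0/24 from AS 3333 comes out valid, the same prefix as a /25 comes out invalid, and an uncovered prefix is not-found.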


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
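
A sorted-array membership test of the kind described above might look like this. as_cmp(), as_set_prepare() and as_set_match() are hypothetical names for illustration, not the bgpd set API.

```c
#include <stdint.h>
#include <stdlib.h>

/* three-way compare without subtraction, so 32-bit ASNs cannot overflow */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* sort once after loading the config */
static void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* O(log n) membership test on the sorted array */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Compared with walking a long list of AS numbers per filter rule, the set is sorted once at load time and every lookup afterwards is logarithmic.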


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
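
The EOF-driven shutdown can be sketched from the child's side as follows. drain_until_eof() is a hypothetical stand-in for a child's event loop. The real daemons poll and parse imsgs, but the exit condition is the same: read(2) returning 0 once the parent has closed its end.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <errno.h>
#include <unistd.h>

/*
 * Read from the IPC socket until the peer closes it.
 * Returns the number of bytes seen before EOF, or -1 on error.
 */
static ssize_t
drain_until_eof(int fd)
{
	char	buf[256];
	ssize_t	n, total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (n == 0)		/* parent closed its end: exit */
			return (total);
		total += n;
	}
}
```

The parent only has to close(2) its descriptors and then wait for the children. No PID is involved, which is why the PID reuse race goes away.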


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
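
A minimal sketch of setting both properties at creation time instead of with follow-up fcntl(2) calls. It assumes a system where the socket type accepts SOCK_NONBLOCK and SOCK_CLOEXEC (OpenBSD and Linux do), and nb_cloexec_pair() is a hypothetical helper. accept4() takes the same two flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a socketpair that is non-blocking and close-on-exec from birth. */
static int
nb_cloexec_pair(int sv[2])
{
	return (socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, sv));
}
```

Passing the flags in the creating call leaves no window in which the descriptor exists without them.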


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
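
The caller-side pattern the quote asks for can be sketched with plain write(2) on a non-blocking descriptor. This does not use the msgbuf API: flush_some() and its return convention are assumptions for illustration only.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Try to write len bytes to a non-blocking fd.
 * Returns 1 if something was written (*done is advanced),
 * 0 if the write would block (POLLOUT is re-armed in *events),
 * -1 on any other error.
 */
static int
flush_some(int fd, const char *buf, size_t len, size_t *done, short *events)
{
	ssize_t	n;

	n = write(fd, buf, len);
	if (n == -1) {
		if (errno == EAGAIN) {
			/* not fatal: ask poll() to tell us when to retry */
			*events |= POLLOUT;
			return (0);
		}
		return (-1);
	}
	*done += (size_t)n;
	return (1);
}
```

Treating EAGAIN as a hard failure would tear down a connection that is merely busy, so the caller keeps its queue and polls again.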


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
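
The two rules involved (start from a zeroed pollfd array, and only look at revents when poll() reports ready descriptors) can be sketched as follows. wait_readable() is a hypothetical helper, and memset() is used here in place of bzero() purely for portability.

```c
#include <poll.h>
#include <string.h>

/*
 * Returns 1 if fd is readable, 0 on timeout, -1 on error (including
 * EINTR). revents is never inspected unless poll() reported events.
 */
static int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd	pfd[1];
	int		nfds;

	memset(pfd, 0, sizeof(pfd));	/* no stale revents survive */
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return (-1);		/* array was not updated */
	if (nfds == 0)
		return (0);		/* timeout, nothing to handle */
	return ((pfd[0].revents & POLLIN) ? 1 : 0);
}
```

In a long-running loop that reuses one array across iterations, skipping the clearing step means an interrupted poll() leaves the previous iteration's revents in place, and acting on them can block in read(2).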


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
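
The ordering can be sketched as follows. All names here (reconfig_pending, do_reload, check_signals, reloads) are hypothetical and only stand in for the real flag and work function.

```c
#include <signal.h>

/* Set from the signal handler, consumed by the main loop. */
static volatile sig_atomic_t reconfig_pending;

static int reloads;	/* hypothetical counter standing in for real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reloads++;
}

/*
 * Clear the flag first, then do the work. A signal arriving while
 * do_reload() runs sets the flag again and is picked up on the next
 * pass instead of being wiped out by a late "flag = 0".
 */
void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```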


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
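
For illustration, a sketch of descriptor passing over an AF_UNIX socket with SCM_RIGHTS. This is the generic mechanism, not the imsg/msgbuf code from this commit; send_fd and recv_fd are hypothetical helper names, and one dummy byte is sent alongside the descriptor.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

#include <string.h>

/* Pass fd over the AF_UNIX socket sock. Returns 0 on success, -1 on error. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(). Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

In the setup described above, the privileged process would call socket() and bind() and then hand the result to the unprivileged one with something like send_fd().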


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
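
A sketch of the waitpid() check described above. reap_child is a hypothetical helper, not a function from bgpd; it only shows how to tell a terminated child apart from other reasons for SIGCHLD.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <errno.h>

/*
 * Reap one child without blocking. Returns the pid of a child that
 * really went away (exited or was killed by a signal), 0 if no child
 * has terminated, or -1 on error (errno is ECHILD when there are no
 * children at all). *status is only meaningful for a positive return.
 */
pid_t
reap_child(int *status)
{
	pid_t pid;

	do {
		pid = waitpid(-1, status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid <= 0)
		return pid;	/* 0: nothing to reap, -1: error */
	if (WIFEXITED(*status) || WIFSIGNALED(*status))
		return pid;
	return 0;
}
```

The caller would log the pid and status and start its shutdown only when the return value is positive.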


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
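
The split between reading and parsing can be sketched like this. It is a simplified stand-in, not the imsg API: struct rbuf, struct msg_hdr and rbuf_get are hypothetical, and the header layout is an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE 4096

/* Hypothetical read buffer; one read(2) may leave several messages here. */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* Hypothetical header: total message length, header included. */
struct msg_hdr {
	uint16_t	len;
	uint16_t	type;
};

/*
 * Extract one complete message from the buffer without touching the
 * socket. Returns 1 and fills hdr/payload if a message was available,
 * 0 if more data is needed, -1 if the message is malformed or too big.
 */
int
rbuf_get(struct rbuf *rb, struct msg_hdr *hdr, unsigned char *payload,
    size_t payload_size)
{
	size_t plen;

	if (rb->len < sizeof(*hdr))
		return 0;
	memcpy(hdr, rb->data, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > RBUF_SIZE)
		return -1;
	if (rb->len < hdr->len)
		return 0;
	plen = hdr->len - sizeof(*hdr);
	if (plen > payload_size)
		return -1;
	memcpy(payload, rb->data + sizeof(*hdr), plen);
	/* shift any following messages to the front of the buffer */
	memmove(rb->data, rb->data + hdr->len, rb->len - hdr->len);
	rb->len -= hdr->len;
	return 1;
}
```

A caller would read(2) into the buffer once per poll event and then call rbuf_get() in a loop until it stops returning 1, so no message waits for the next poll event.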


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...
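
A rough sketch of the resulting pair of functions. The message format and the fatal_fmt/fatalx_fmt helpers are assumptions for illustration; the real bgpd versions log through its own logging code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format "fatal: <emsg>: <strerror(errno)>"; errno is read, not passed in. */
int
fatal_fmt(char *buf, size_t len, const char *emsg)
{
	return snprintf(buf, len, "fatal: %s: %s", emsg, strerror(errno));
}

/* Format "fatal: <emsg>" without consulting errno or strerror(). */
int
fatalx_fmt(char *buf, size_t len, const char *emsg)
{
	return snprintf(buf, len, "fatal: %s", emsg);
}

/* Report an error that set errno, then terminate. */
void
fatal(const char *emsg)
{
	char buf[256];

	fatal_fmt(buf, sizeof(buf), emsg);
	fprintf(stderr, "%s\n", buf);
	exit(1);
}

/* Report an error unrelated to errno, then terminate. */
void
fatalx(const char *emsg)
{
	char buf[256];

	fatalx_fmt(buf, sizeof(buf), emsg);
	fprintf(stderr, "%s\n", buf);
	exit(1);
}
```

A caller that obtained an error number some other way, for example from a socket option, would assign it to errno before calling fatal().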


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.227 02-Oct-2019 claudio

In IMSG_PFKEY_RELOAD do not warn about unknown peers. When a peer is
removed the session engine will issue a IMSG_PFKEY_RELOAD call after
the parent has removed the peer which is no problem and so no need
to fill the log with this.
OK benno@


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
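
Roughly, the idea looks like this. make_ipc_pair and READ_BUF_SIZE are hypothetical, and the value 65535 is an assumption rather than the read size bgpd uses. Treating a setsockopt() failure as fatal is also a simplification for this sketch.

```c
#include <sys/types.h>
#include <sys/socket.h>

#include <unistd.h>

#define READ_BUF_SIZE 65535	/* assumed read size, hypothetical value */

/*
 * Create a socketpair and raise the send and receive buffers of both
 * ends to four times the read size, so that a single read or write
 * can usually move a full buffer. Returns 0 on success, -1 on error.
 */
int
make_ipc_pair(int pair[2])
{
	int i, bsize = READ_BUF_SIZE * 4;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, pair) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(pair[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize)) == -1 ||
		    setsockopt(pair[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize)) == -1) {
			close(pair[0]);
			close(pair[1]);
			return -1;
		}
	}
	return 0;
}
```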


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
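
A small illustration of the preferred form. struct peer_conf, queue_obj and send_peer are hypothetical; queue_obj only stands in for an imsg compose call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical config object; the fields are for illustration only. */
struct peer_conf {
	unsigned int	id;
	unsigned int	remote_as;
	char		descr[32];
};

/*
 * Hypothetical stand-in for an imsg compose call: copies len bytes of
 * data into buf and returns the number of bytes queued, 0 if too large.
 */
size_t
queue_obj(unsigned char *buf, size_t buflen, const void *data, size_t len)
{
	if (len > buflen)
		return 0;
	memcpy(buf, data, len);
	return len;
}

size_t
send_peer(unsigned char *buf, size_t buflen, const struct peer_conf *pc)
{
	/*
	 * sizeof(*pc) follows the type of pc automatically; writing
	 * sizeof(struct peer_conf) would silently go stale if pc's type
	 * were ever changed.
	 */
	return queue_obj(buf, buflen, pc, sizeof(*pc));
}
```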


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
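
One way to handle both cases, as a sketch. move_fd is a hypothetical helper and not the code from this commit; clearing the flag with fcntl() in the equal case is one possible fix.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make oldd available as newd in a process about to exec().
 * dup2() clears close-on-exec on the new descriptor, but when
 * oldd == newd it does nothing, so the flag has to be cleared by hand.
 * Returns 0 on success, -1 on error.
 */
int
move_fd(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	if (dup2(oldd, newd) == -1)
		return -1;
	return 0;
}
```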


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd;
this ensures that the imsgs actually go out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
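
The lookup idea can be sketched with a sorted array and bsearch(3). The function names are hypothetical and the real as-set code is organised differently; this only shows the membership test.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two AS numbers for qsort(3) and bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	if (x < y)
		return -1;
	return x > y;
}

/* Sort the set once after loading so that lookups can use bsearch(). */
void
as_set_prep(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, where walking a list of AS numbers in a filter rule costs O(n).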


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
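
The child side of such a teardown can be sketched as follows. child_loop is hypothetical and reads raw bytes instead of imsgs; the point is only that read() returning 0 is the shutdown signal.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical child-side loop: consume data from the parent until
 * read() reports EOF, which is what the child sees once the parent
 * has closed its end of the socket. Returns bytes consumed or -1.
 */
long
child_loop(int fd)
{
	char buf[512];
	ssize_t n;
	long total = 0;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;		/* parent closed the pipe: shut down */
		total += n;
	}
	return total;
}
```

The parent only has to close() its descriptor; no kill(2) and no pid bookkeeping are involved.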


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
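
A minimal sketch of the pattern this commit describes, assuming a parent that watches two descriptors. The names poll_once() and PFD_COUNT are hypothetical and not taken from bgpd.c. The sketch also declines to read revents after an error or timeout, which is a conservative choice on top of the zeroing.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>

#define PFD_COUNT	2	/* hypothetical: one slot per monitored fd */

/*
 * Poll two descriptors once. Returns poll()'s result; on success
 * *ready0 and *ready1 say whether fd0 and fd1 are readable.
 */
int
poll_once(int fd0, int fd1, int timeout_ms, int *ready0, int *ready1)
{
	struct pollfd	pfd[PFD_COUNT];
	int		nfds;

	assert(ready0 != NULL && ready1 != NULL);

	/* start from a clean array so no stale revents survive an error */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd0;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd1;
	pfd[1].events = POLLIN;

	*ready0 = *ready1 = 0;
	nfds = poll(pfd, PFD_COUNT, timeout_ms);
	if (nfds <= 0)
		return (nfds);	/* error or timeout: do not look at revents */

	*ready0 = (pfd[0].revents & POLLIN) != 0;
	*ready1 = (pfd[1].revents & POLLIN) != 0;
	return (nfds);
}
```

With the array rebuilt from zero on every pass, an EINTR from poll() cannot leave an old POLLIN bit behind that would send the parent into a blocking read.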


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
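
For readers unfamiliar with descriptor passing, here is a minimal sketch of the underlying mechanism using SCM_RIGHTS over an AF_UNIX socket. It is not the imsg/msgbuf implementation; send_fd() and recv_fd() are hypothetical helper names, and the single dummy data byte is an assumption of this example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* control buffer sized and aligned for exactly one descriptor */
union fd_cmsg {
	struct cmsghdr	hdr;	/* forces correct alignment */
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the AF_UNIX socket sock. Returns 0 or -1. */
int
send_fd(int sock, int fd)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	struct cmsghdr	*cmsg;
	union fd_cmsg	 cmsgbuf;
	char		 c = 0;	/* one dummy byte of ordinary data */

	assert(fd >= 0);
	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from sock. Returns the new fd or -1. */
int
recv_fd(int sock)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	struct cmsghdr	*cmsg;
	union fd_cmsg	 cmsgbuf;
	char		 c;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return (fd);
}
```

In the scenario from the commit message, the privileged process would call socket() and bind() and pass the result with something like send_fd(); the unprivileged receiver gets a new descriptor number that refers to the same open socket.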


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit
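
A small sketch of the check this commit describes, under the assumption that the parent reaps children without blocking. The helper name reap_child() is hypothetical; the real code in bgpd.c may be structured differently.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Reap one child without blocking. Returns its pid if it exited or
 * was killed by a signal, 0 if no child has terminated, -1 on error.
 */
pid_t
reap_child(void)
{
	int	status;
	pid_t	pid;

	pid = waitpid(-1, &status, WNOHANG);
	if (pid <= 0)
		return (pid);	/* 0: nothing to reap, -1: error */

	/*
	 * SIGCHLD alone proves nothing; only the status tells us the
	 * child is gone. With these flags waitpid() reports terminated
	 * children only, so the test mainly documents the intent.
	 */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return (pid);
	return (0);
}
```

A caller would typically invoke this from the main loop after the SIGCHLD flag was set, then log and quit only when a nonzero pid comes back.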

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
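
The split between reading and parsing can be sketched as follows. This is not the imsg API: struct rbuf, rbuf_read(), rbuf_get() and the 2-byte length header are assumptions invented for this example, and empty payloads are not allowed so that a return of 0 always means "no complete message buffered".

```c
#include <sys/types.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	2	/* hypothetical header: 16-bit total length */

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

/* One read(2) per poll event; may pull in several messages at once. */
ssize_t
rbuf_read(struct rbuf *rb, int fd)
{
	ssize_t	n;

	assert(rb->len <= sizeof(rb->buf));
	n = read(fd, rb->buf + rb->len, sizeof(rb->buf) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return (n);
}

/*
 * Extract one complete message from the buffer, never touching the fd.
 * Returns the payload length, 0 if no complete message is buffered,
 * -1 on a malformed header or a too-small output buffer.
 */
ssize_t
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outlen)
{
	size_t	total, plen;

	if (rb->len < HDR_SIZE)
		return (0);
	total = ((size_t)rb->buf[0] << 8) | rb->buf[1];
	if (total <= HDR_SIZE || total > RBUF_SIZE)
		return (-1);
	if (rb->len < total)
		return (0);	/* message still incomplete */
	plen = total - HDR_SIZE;
	if (plen > outlen)
		return (-1);
	memcpy(out, rb->buf + HDR_SIZE, plen);
	/* shift any following messages to the front of the buffer */
	memmove(rb->buf, rb->buf + total, rb->len - total);
	rb->len -= total;
	return ((ssize_t)plen);
}
```

On a poll event the caller would call rbuf_read() once and then loop on rbuf_get() until it returns 0, so that messages which arrived in the same read are not left waiting for the next event.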


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.226 01-Oct-2019 claudio

For portable kr_init() returns an fd of -1 which now would end up in an
immediate exit of bgpd. Instead pass the fd via pointer arg.
OK benno@


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
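
A minimal sketch of the buffer sizing described in revision 1.220, assuming a POSIX socket API. `READ_SIZE` and `make_big_socketpair` are hypothetical names and the value 65535 is an assumption, not the constant used by bgpd.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE 65535	/* hypothetical read size, an assumption */

/*
 * Create a socketpair and request send and receive buffers of four
 * times the read size on both ends.  Returns 0 on success, -1 on error.
 */
static int
make_big_socketpair(int sv[2])
{
	int i, size = 4 * READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &size, sizeof(size)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &size, sizeof(size)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```

The kernel may clamp or round the requested size, so the value read back with getsockopt() is not guaranteed to equal the request.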


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
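
The pitfall fixed in revision 1.214 can be sketched as follows, assuming POSIX fcntl() and dup2(). `move_fd_for_exec` is a hypothetical helper, not the function in bgpd.c. When the source and target descriptors are equal, dup2() does nothing, so the close-on-exec flag has to be cleared by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make "fd" available as descriptor "target" in a process that is
 * about to exec.  Returns 0 on success, -1 on error.
 */
static int
move_fd_for_exec(int fd, int target)
{
	int flags;

	if (fd == target) {
		/* dup2() would be a no-op and leave FD_CLOEXEC set. */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* A descriptor created by dup2() does not have FD_CLOEXEC set. */
	if (dup2(fd, target) == -1)
		return -1;
	return 0;
}
```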


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
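
A minimal sketch of a bsearch-based AS lookup as described in revision 1.195, using only the C standard library. The function names `as_cmp`, `as_set_prepare` and `as_set_match` are hypothetical and the layout (a plain sorted array of 32-bit AS numbers) is an assumption, not bgpd's actual set_table.

```c
#include <stdint.h>
#include <stdlib.h>

/* Compare two AS numbers for qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the table once after loading so lookups can use bsearch(). */
static void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if "as" is in the sorted table, 0 otherwise. */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

A lookup costs O(log n) comparisons, compared with a linear walk over a long list of AS numbers in a filter rule.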


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
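
The "pipe teardown" from revision 1.187 can be sketched like this, assuming POSIX fork(), socketpair() and waitpid(). `child_loop` and `run_and_teardown` are hypothetical names; the child's loop stands in for a real event loop.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Child side: read until the parent closes its end of the socketpair.
 * read() returning 0 means EOF, which ends the loop.
 */
static void
child_loop(int fd)
{
	char buf[64];
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;	/* a real daemon would process messages here */
	_exit(n == 0 ? 0 : 1);
}

/*
 * Parent side: start a child, then shut it down without kill(2) by
 * closing the IPC socket.  Returns the child's exit status, or -1.
 */
static int
run_and_teardown(void)
{
	int sv[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1)
		return -1;
	if (pid == 0) {
		close(sv[0]);
		child_loop(sv[1]);	/* does not return */
	}
	close(sv[1]);
	close(sv[0]);			/* the child now sees EOF */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent never calls kill(2), it has no stored PID that could be reused by an unrelated process before the signal is sent.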


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
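
A short sketch of the flag-based approach from revision 1.175, assuming a system that supports the `SOCK_NONBLOCK` and `SOCK_CLOEXEC` socket type flags (OpenBSD and Linux do). `open_session_socket` is a hypothetical name. Both properties are set atomically at creation time, so no separate fcntl() calls are needed afterwards.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/*
 * Create a stream socket that is non-blocking and close-on-exec from
 * the start.  Returns the descriptor, or -1 on error.
 */
static int
open_session_socket(int domain)
{
	return socket(domain, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}
```

The same flags can be passed as the last argument of accept4() so that accepted connections inherit neither a blocking mode nor a leaked descriptor across exec.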


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
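
How a broken pipe is detected without the signal can be sketched as follows, assuming POSIX pipe() and write(). `write_to_broken_pipe` is a hypothetical helper. With SIGPIPE ignored, the write fails with EPIPE instead of terminating the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Ignore SIGPIPE (process-wide) and write to a pipe whose read end is
 * already closed.  Returns the errno from write(), 0 if the write
 * unexpectedly succeeded, or -1 on a setup failure.
 */
static int
write_to_broken_pipe(void)
{
	int p[2], saved;

	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	if (pipe(p) == -1)
		return -1;
	close(p[0]);			/* nobody will ever read */
	saved = (write(p[1], "x", 1) == -1) ? errno : 0;
	close(p[1]);
	return saved;
}
```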


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
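
The fix in revision 1.127 amounts to rebuilding the pollfd array from a zeroed state on every loop pass. A minimal sketch, assuming POSIX poll(); `PFD_COUNT` and `poll_once` are hypothetical names and the array size is an assumption.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical number of descriptors watched */

/*
 * Fill the pollfd array from scratch before every poll() call.  If
 * poll() fails, for example with EINTR, it may not write revents back,
 * so stale revents from the previous round must not survive.
 */
static int
poll_once(struct pollfd *pfd, const int *fds, int timeout)
{
	int i;

	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	for (i = 0; i < PFD_COUNT; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return poll(pfd, PFD_COUNT, timeout);
}
```

Callers should still check the return value and skip the per-descriptor handling on -1 or 0, as revision 1.128 later made explicit.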


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
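
The ordering described in revision 1.106 can be sketched as below. The names `reconfig_pending`, `sighdlr`, `do_work` and `check_flag` are hypothetical and do not come from bgpd.c.

```c
#include <signal.h>

/* Hypothetical flag set by a signal handler. */
static volatile sig_atomic_t reconfig_pending;
static int work_runs;

static void
sighdlr(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Hypothetical stand-in for the real work done in the main loop. */
static void
do_work(void)
{
	work_runs++;
}

/*
 * Clear the flag first, then do the work.  A signal arriving while
 * do_work() runs sets the flag again and is handled on the next pass
 * instead of being wiped out by a late "flag = 0".
 */
static void
check_flag(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_work();
	}
}
```

With the original order, a signal delivered between the start of the work and the reset of the flag would be lost.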


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
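
Note: a hedged sketch of the waitpid()-and-check-status idea, not the code from this revision. SIGCHLD is also delivered when a child is stopped or continued, so the status has to be inspected before concluding that the child is gone. child_terminated and spawn_exiting_child are hypothetical names; the second one exists only to make the example self-contained.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/*
 * Non-blocking check of one child.
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if it is still around, -1 on waitpid() error.
 */
int
child_terminated(pid_t child)
{
	int status;
	pid_t pid;

	do {
		pid = waitpid(child, &status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return -1;
	if (pid == child && (WIFEXITED(status) || WIFSIGNALED(status)))
		return 1;
	return 0;
}

/* Demonstration helper: fork a child that exits at once with `code`. */
pid_t
spawn_exiting_child(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return pid;
}
```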


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
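
Note: the split between "read into the buffer" and "take one message out of the buffer" can be sketched as below. This is not the imsg API; struct rbuf, rbuf_append, rbuf_get and rbuf_drain are hypothetical, and the one-byte length header is an assumption chosen to keep the example short. The point is the loop in rbuf_drain: one read may deliver several messages, so the caller keeps extracting until the buffer holds no complete message.

```c
#include <stddef.h>
#include <string.h>

#define RBUF_SIZE 4096

struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/* Append n freshly read bytes; returns -1 if they do not fit. */
int
rbuf_append(struct rbuf *rb, const void *p, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return -1;
	memcpy(rb->data + rb->len, p, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one message: a 1-byte payload length followed by the payload.
 * Returns the payload length if a complete message was copied to out,
 * -1 if more data is needed, -2 if out is too small.
 * Never reads from a descriptor itself.
 */
int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	size_t plen;

	if (rb->len < 1)
		return -1;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return -1;
	if (plen > outsz)
		return -2;
	memcpy(out, rb->data + 1, plen);
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return (int)plen;
}

/* Process every complete message in the buffer; returns the count. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char msg[255];
	int n = 0;

	while (rbuf_get(rb, msg, sizeof(msg)) >= 0)
		n++;	/* a real caller would dispatch msg here */
	return n;
}
```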


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.225 08-Aug-2019 claudio

Parse the config file early on startup before bgpd is daemonized.
This way config errors will be directly user visible on startup.
To do this split out send_config() out of reconfigure() which is
sending the config to the SE and RDE.
OK sthen@


# 1.224 05-Aug-2019 claudio

Cleanup config reload in the RDE. Use the bgpd_conf struct to store sets
and l3vpns instead of temporary globals. Also rework rde_reload_done to
free filters and sets earlier. The soft-reconfiguration process no longer
needs the previous filters / sets to do its work since there is a full
Adj-RIB-Out.
OK benno@


# 1.223 05-Aug-2019 claudio

Change the order how filtersets are passed during config reloads. Instead
of sending them after e.g. the filter rule send them before. The benefit
is that the filterset is present when a rule is added and so the filter
rule is complete at that moment.
OK benno@


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
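
Note: a hedged sketch of enlarging the buffers on a socketpair(), not the code from this revision. make_ipc_pair is a hypothetical name and the caller picks bufsize; the commit only says "4 times the read size". The kernel may clamp or adjust the requested value, so the size actually in effect can differ from what was asked for.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a connected AF_UNIX stream pair and request bufsize bytes
 * for the send and receive buffer of both ends.
 * Returns 0 on success, -1 on error (both fds closed).
 */
int
make_ipc_pair(int fds[2], int bufsize)
{
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(fds[i], SOL_SOCKET, SO_RCVBUF,
		    &bufsize, sizeof(bufsize)) == -1 ||
		    setsockopt(fds[i], SOL_SOCKET, SO_SNDBUF,
		    &bufsize, sizeof(bufsize)) == -1) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
	}
	return 0;
}
```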


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process along the way. The keys are refreshed whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
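
Note: one way to handle the oldd == newd case, sketched under the assumption that the descriptor was opened with close-on-exec set. move_fd_for_exec is a hypothetical name and this is not necessarily how the commit does it. dup2() to a different number yields a descriptor without FD_CLOEXEC, but dup2(fd, fd) changes nothing, so the flag has to be cleared by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make the file behind oldd available as newd across exec().
 * Returns 0 on success, -1 on error.
 */
int
move_fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would be a no-op and leave FD_CLOEXEC set. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* The duplicate created by dup2() does not inherit FD_CLOEXEC. */
	if (dup2(oldd, newd) == -1)
		return -1;
	return 0;
}
```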


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
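
Note: the bsearch lookup mentioned above can be sketched as follows. This is an illustration, not the as-set code; as_cmp, as_set_prepare and as_set_match are hypothetical names and the set is assumed to be a plain array of 32-bit AS numbers. The array is sorted once, after which each membership test costs O(log n) comparisons.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once so that it can be searched with bsearch(). */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```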


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
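
Note: a self-contained sketch of the "pipe teardown" idea, with a hypothetical function name (run_teardown_demo) and a trivial child loop in place of a real event loop. The parent never calls kill(2); closing its end of the socketpair makes the child's read() return 0, which the child treats as the signal to exit.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that reads from a socketpair until EOF, then shut it
 * down by closing the parent's end.
 * Returns the child's exit status (0 if it saw EOF), or -1 on error.
 */
int
run_teardown_demo(void)
{
	int sv[2], status;
	pid_t pid;
	char buf[64];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);
		/* stand-in for the event loop: handle data until EOF */
		while ((n = read(sv[1], buf, sizeof(buf))) > 0)
			;
		_exit(n == 0 ? 0 : 1);
	}
	close(sv[1]);
	close(sv[0]);		/* this delivers EOF to the child */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```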


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
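
Note: a hedged sketch of requesting the flags at socket creation instead of setting them afterwards with fcntl(). open_unix_socket and fd_is_nonblock_cloexec are hypothetical names. It assumes a system where socket() accepts SOCK_NONBLOCK and SOCK_CLOEXEC in the type argument, as OpenBSD and Linux do; accept4() takes the same two flags for accepted connections and is not shown here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/* Create a socket that is non-blocking and close-on-exec from the start. */
int
open_unix_socket(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Returns 1 if both flags are set on fd, 0 if not, -1 on fcntl() error. */
int
fd_is_nonblock_cloexec(int fd)
{
	int fl, fdfl;

	if ((fl = fcntl(fd, F_GETFL)) == -1 ||
	    (fdfl = fcntl(fd, F_GETFD)) == -1)
		return -1;
	return (fl & O_NONBLOCK) != 0 && (fdfl & FD_CLOEXEC) != 0;
}
```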


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
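
A minimal sketch of the pattern described above, not the actual bgpd.c
code: the pollfd array is cleared before it is filled in, so a failed
poll() (e.g. EINTR) cannot leave revents from the previous iteration
behind. PFD_COUNT and poll_once() are hypothetical names used only for
illustration.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT	2	/* hypothetical: one entry per pipe to a child */

/*
 * Poll the given descriptors once and return the poll() result.
 * Because the array is zeroed first, revents never carries values
 * from an earlier call when poll() fails without updating it.
 */
int
poll_once(const int *fds, struct pollfd *pfd, int timeout_ms)
{
	int i;

	/* Clear stale revents before setting the array up again. */
	memset(pfd, 0, sizeof(pfd[0]) * PFD_COUNT);
	for (i = 0; i < PFD_COUNT; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

The caller would still check the return value first and only look at
revents when poll() returned a positive count.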


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
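
A minimal sketch of the ordering this commit describes, assuming a
SIGHUP-driven reload; reconfig_pending, do_reload() and
handle_pending_reload() are hypothetical names, not the ones used in
bgpd.c. If the flag were cleared after the work, a signal arriving while
the work runs would be lost; clearing it first means such a signal sets
the flag again and is picked up on the next loop iteration.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;	/* set by the handler */
static int reloads_done;			/* stand-in for real work */

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reloads_done++;
}

/* One main-loop iteration: clear the flag first, then do the work. */
void
handle_pending_reload(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```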


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
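
The split described above (one function that fills the buffer, another
that extracts exactly one message from it) can be sketched as follows.
This is an illustration under assumptions, not the imsg API: the
demo_rbuf and demo_hdr types and the demo_fill()/demo_get() functions are
hypothetical, and the header layout is invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_BUFSZ	4096

struct demo_rbuf {		/* hypothetical read buffer, one per pipe */
	unsigned char	buf[DEMO_BUFSZ];
	size_t		len;	/* bytes currently buffered */
};

struct demo_hdr {		/* hypothetical header */
	uint16_t	len;	/* total length including this header */
	uint16_t	type;
};

/* Append raw bytes, as one read(2) on the pipe would. -1 if no room. */
int
demo_fill(struct demo_rbuf *rb, const void *data, size_t n)
{
	if (n > sizeof(rb->buf) - rb->len)
		return -1;
	memcpy(rb->buf + rb->len, data, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one complete message without touching the descriptor.
 * Returns 1 on success, 0 if no complete message is buffered yet,
 * -1 on a malformed header or a payload larger than paylen.
 */
int
demo_get(struct demo_rbuf *rb, struct demo_hdr *hdr, unsigned char *payload,
    size_t paylen)
{
	size_t plen;

	if (rb->len < sizeof(*hdr))
		return 0;
	memcpy(hdr, rb->buf, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > sizeof(rb->buf))
		return -1;
	if (rb->len < hdr->len)
		return 0;
	plen = hdr->len - sizeof(*hdr);
	if (plen > paylen)
		return -1;
	memcpy(payload, rb->buf + sizeof(*hdr), plen);
	/* Shift the remaining bytes down so the next call sees them. */
	memmove(rb->buf, rb->buf + hdr->len, rb->len - hdr->len);
	rb->len -= hdr->len;
	return 1;
}
```

On a poll event the caller fills the buffer once and then calls
demo_get() in a loop until it returns 0, so messages that arrived in the
same read are not left waiting for the next event.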


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.222 24-Jul-2019 benno

mrt.h only needs to be included by mrt.c
ok claudio@


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
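
A minimal sketch of the idea, assuming the buffers are enlarged with
SO_RCVBUF and SO_SNDBUF (the commit message does not say how it is done);
DEMO_READ_SIZE and big_socketpair() are hypothetical names, and the
kernel is free to clamp the requested size.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define DEMO_READ_SIZE	65535	/* hypothetical size of one read */

/*
 * Create a socketpair and request send and receive buffers of four
 * times the read size on both ends. Returns 0 on success, -1 on error.
 */
int
big_socketpair(int sv[2])
{
	int i, size = 4 * DEMO_READ_SIZE;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	for (i = 0; i < 2; i++) {
		if (setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &size, sizeof(size)) == -1 ||
		    setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &size, sizeof(size)) == -1) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}
	return 0;
}
```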


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
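
The idiom can be illustrated with a small sketch; struct demo_peer,
demo_send() and send_peer() are hypothetical stand-ins, not bgpd types or
imsg functions. Using sizeof(*p) ties the length to whatever the pointer
points at, so it stays correct if the pointer's type is changed later.

```c
#include <stddef.h>
#include <string.h>

struct demo_peer {		/* hypothetical payload type */
	unsigned int	id;
	char		descr[32];
};

/* Hypothetical stand-in for a compose call: copies len bytes to dst. */
static size_t
demo_send(void *dst, const void *data, size_t len)
{
	memcpy(dst, data, len);
	return len;
}

size_t
send_peer(void *dst, const struct demo_peer *p)
{
	/* Preferred: 'p, sizeof(*p)' rather than 'p, sizeof(struct demo_peer)'. */
	return demo_send(dst, p, sizeof(*p));
}
```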


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state is changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
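
A minimal sketch of one way to handle this case, not necessarily the
change made in this revision: when the descriptor already has the wanted
number, dup2() does nothing, so the close-on-exec flag has to be cleared
explicitly. pass_fd_as() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make fd available as descriptor number target without the
 * close-on-exec flag. Returns 0 on success, -1 on error.
 */
int
pass_fd_as(int fd, int target)
{
	int flags;

	if (fd == target) {
		/* dup2() would be a no-op here; clear FD_CLOEXEC by hand. */
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1 ? -1 : 0;
	}
	/* A descriptor newly created by dup2() does not have FD_CLOEXEC set. */
	return dup2(fd, target) == -1 ? -1 : 0;
}
```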


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
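
The lookup described above can be sketched with the standard bsearch(3) over a sorted array of AS numbers. This is an illustration only; as_set_prepare() and as_set_match() are hypothetical names, not the functions bgpd uses.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once after loading the set; bsearch requires sorted input. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if `as` is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each membership test costs O(log n) comparisons instead of a walk over a long list of filter terms.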


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
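
A small self-contained sketch of the "pipe teardown" idea, assuming a plain socketpair(2) in place of bgpd's imsg channels: the parent never calls kill(2), it only closes its end, and the child leaves its read loop on EOF. The function name pipe_teardown_demo() is made up for this example.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit status: 0 means the child saw EOF and exited. */
int
pipe_teardown_demo(void)
{
	int sv[2], status;
	pid_t pid;
	ssize_t n;
	char c;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: drop the parent's end, then read until EOF */
		close(sv[0]);
		while ((n = read(sv[1], &c, 1)) > 0)
			;
		_exit(n == 0 ? 0 : 1);
	}
	close(sv[1]);	/* parent does not use the child's end */
	close(sv[0]);	/* closing the IPC socket is the shutdown signal */
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```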


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
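
As an illustration of the replacement technique (not the bgpd code itself), the flags can be requested atomically when the descriptor is created, so no separate fcntl(2) calls are needed afterwards. The helper names below are hypothetical; accept4() takes the same two flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a socket that is non-blocking and close-on-exec from the start. */
int
open_nonblock_cloexec(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Returns 1 if both O_NONBLOCK and FD_CLOEXEC are set on fd, else 0. */
int
fd_flags_ok(int fd)
{
	int fl = fcntl(fd, F_GETFL);
	int fdfl = fcntl(fd, F_GETFD);

	if (fl == -1 || fdfl == -1)
		return 0;
	return (fl & O_NONBLOCK) != 0 && (fdfl & FD_CLOEXEC) != 0;
}
```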


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
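
A hedged sketch of the pattern this commit describes, reduced to a single descriptor: zero the pollfd array before every poll(2) call, and do not look at revents when poll reports an error or a timeout. The function wait_readable() is a made-up helper, not part of bgpd.

```c
#include <poll.h>
#include <string.h>

/*
 * Returns 1 if fd is readable (or hung up), 0 on timeout,
 * -1 on error such as EINTR.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int n;

	/* start from a clean array so stale revents can never be seen */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n <= 0)
		return n;	/* error or timeout: do not touch revents */
	return (pfd[0].revents & (POLLIN | POLLHUP)) ? 1 : 0;
}
```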


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
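
The ordering can be sketched as follows; the flag, handler and work function names are invented for the example. If the flag were cleared after the work, a signal arriving while the work runs would be wiped out and lost. Clearing first means such a signal sets the flag again and is handled on the next pass.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set from the signal handler */
int reloads_done;			/* counts completed reloads */

/* Signal handler: only records that work is pending. */
void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop: clear the flag first, then do the work. */
void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```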


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
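
A minimal sketch of the check described above, using only standard waitpid(2) macros. SIGCHLD is also delivered for stopped or continued children, so the status has to be inspected before concluding that a child is gone. The helper check_child() and its return convention are assumptions for illustration; a signal-driven caller would typically pass WNOHANG.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/*
 * Returns 1 if the watched child has exited or was killed by a signal,
 * 0 if nothing changed, the child merely stopped, or waitpid failed.
 */
int
check_child(pid_t watched, int options)
{
	int status;
	pid_t pid;

	do {
		pid = waitpid(watched, &status, options);
	} while (pid == -1 && errno == EINTR);

	if (pid <= 0)
		return 0;	/* no state change, or error */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;	/* the child really went away */
	return 0;		/* e.g. stopped: still alive */
}
```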


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
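
The split between reading into a buffer and extracting messages from it can be illustrated with a small sketch. The struct and function names below (msg_hdr, rbuf, rbuf_fill, rbuf_get) are hypothetical stand-ins, not the imsg API of this revision, and the framing (type plus total length) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define RBUF_SIZE 4096

/* Hypothetical stand-ins for the real imsg structures. */
struct msg_hdr {
	uint16_t	type;
	uint16_t	len;	/* total length, header included */
};

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		wpos;	/* bytes currently buffered */
};

/* Append raw bytes, the way a read(2) on the pipe would. */
static int
rbuf_fill(struct rbuf *rb, const void *data, size_t len)
{
	if (len > sizeof(rb->buf) - rb->wpos)
		return (-1);
	memcpy(rb->buf + rb->wpos, data, len);
	rb->wpos += len;
	return (0);
}

/*
 * Extract exactly one message from the buffer, never touching the fd.
 * Returns the message length, 0 if no complete message is buffered,
 * or -1 on a malformed header or too small payload buffer.
 */
static ssize_t
rbuf_get(struct rbuf *rb, struct msg_hdr *hdr, void *payload, size_t paylen)
{
	size_t datalen;

	if (rb->wpos < sizeof(*hdr))
		return (0);
	memcpy(hdr, rb->buf, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > RBUF_SIZE)
		return (-1);
	if (rb->wpos < hdr->len)
		return (0);
	datalen = hdr->len - sizeof(*hdr);
	if (datalen > paylen)
		return (-1);
	memcpy(payload, rb->buf + sizeof(*hdr), datalen);
	/* keep any following messages: shift them to the front */
	memmove(rb->buf, rb->buf + hdr->len, rb->wpos - hdr->len);
	rb->wpos -= hdr->len;
	return ((ssize_t)hdr->len);
}
```

A caller would fill the buffer once per poll event and then call the get function in a loop until it returns 0, so that several messages arriving in one read are all processed.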


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all the time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.
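
As a rough sketch of what treating a zero return from write() as its own case might look like: the helper below is hypothetical (name, signature and return convention are assumptions, not the msgbuf API), and it assumes the caller never asks for a zero-length write.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try one write of len (> 0) bytes.
 * Returns 1 if some bytes went out (*done is advanced), 0 if write()
 * returned 0 (treated here as "connection closed"), -1 on error with
 * errno set by write().
 */
static int
write_some(int fd, const char *buf, size_t len, size_t *done)
{
	ssize_t n;

	n = write(fd, buf, len);
	if (n == -1)
		return (-1);	/* real error */
	if (n == 0)
		return (0);	/* not an error, but no progress either */
	*done += (size_t)n;
	return (1);
}
```

With three distinct results the caller can tell an error from a closed peer instead of lumping both together.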


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.221 23-Jul-2019 claudio

Clean up RIB related kroute code. Introduce a way to flush a FIB table
from the RDE. Make sure that all nexthops don't get removed in the FIB
when a FIB table is removed. This should only happen for the main FIB.
Remove F_RIB_HASNOFIB which is just confusing since there is already
F_RIB_NOFIB and F_RIB_NOFIBSYNC.
OK benno@


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
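
Enlarging the buffers of a socketpair() could look roughly like the sketch below. The READ_SIZE value and the helper name big_socketpair() are assumptions for illustration only; the actual read size and error handling in bgpd may differ.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

#define READ_SIZE	65535	/* assumed read size, for illustration only */

/*
 * Hypothetical helper: create a socketpair and ask the kernel for
 * send and receive buffers of 4 times the read size on both ends.
 * The setsockopt() calls are best effort; the kernel may clamp or
 * refuse the requested size.
 */
static int
big_socketpair(int sv[2])
{
	int bsize, i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	for (i = 0; i < 2; i++) {
		bsize = 4 * READ_SIZE;
		(void)setsockopt(sv[i], SOL_SOCKET, SO_RCVBUF,
		    &bsize, sizeof(bsize));
		bsize = 4 * READ_SIZE;
		(void)setsockopt(sv[i], SOL_SOCKET, SO_SNDBUF,
		    &bsize, sizeof(bsize));
	}
	return (0);
}
```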


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in the process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
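
One way to avoid the problem described above is sketched below. The helper name fd_for_exec() is hypothetical, and clearing the flag with fcntl() in the equal case is an assumption about how this can be handled, not a quote of the committed code.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make sure oldd is available as newd in the
 * exec'd child. Returns 0 on success, -1 on error.
 */
static int
fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would do nothing here and FD_CLOEXEC would stay set */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* the descriptor created by dup2() has FD_CLOEXEC cleared */
	return (dup2(oldd, newd) == -1 ? -1 : 0);
}
```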


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@
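
The idea of storing an accepted prefixlen range as a bitmask on a node can be sketched in isolation. This only illustrates the bitmask part, not the trie itself; it is limited to IPv4 lengths (0 to 32), and the function names plen_mask() and plen_match() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: bit i of the mask is set when prefix length i
 * is accepted. Covers IPv4 only, so 33 bits of a uint64_t are used.
 */
static uint64_t
plen_mask(unsigned int min, unsigned int max)
{
	uint64_t mask = 0;
	unsigned int i;

	for (i = min; i <= max && i <= 32; i++)
		mask |= (uint64_t)1 << i;
	return (mask);
}

/* Test whether a given prefix length is covered by the mask. */
static int
plen_match(uint64_t mask, unsigned int plen)
{
	return (plen <= 32 && ((mask >> plen) & 1) != 0);
}
```

Several "prefixlen min - max" specs for the same addr/plen could then be merged on one node by OR-ing their masks, which is presumably why a bitmask is convenient here.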


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
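
A bsearch-based membership test over a sorted array of AS numbers can be sketched as follows. The names as_cmp() and as_set_match() are hypothetical and the flat uint32_t array is an assumption; the real table layout in bgpd may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Compare two AS numbers without risking integer overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/*
 * Hypothetical lookup: returns 1 if asnum is in the sorted array set
 * of n elements, 0 otherwise. The array must be sorted with as_cmp().
 */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t asnum)
{
	return (bsearch(&asnum, set, n, sizeof(*set), as_cmp) != NULL);
}
```

The table has to be sorted once after loading (for example with qsort() and the same comparison function); after that each lookup is logarithmic in the set size rather than a linear walk over a filter list.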


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
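
The EOF-on-close behaviour this shutdown scheme relies on can be demonstrated with a small sketch. The helper peer_closed() is hypothetical; in the daemon the check would happen inside the child's event loop rather than in a standalone function.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical child-side check: returns 1 if the peer closed the
 * IPC socket (read() reports EOF), 0 if data arrived, -1 on error.
 */
static int
peer_closed(int fd)
{
	char c;
	ssize_t n;

	n = read(fd, &c, 1);
	if (n == -1)
		return (-1);
	return (n == 0);
}
```

When the parent closes its end of the socketpair, the other end sees EOF without any signal being sent, so no pid is involved and there is nothing to race against.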


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
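
Setting both properties at creation time, instead of with a separate fcntl() pass afterwards, can be sketched like this. The helper name ipc_pair() is hypothetical, and the example assumes a system such as OpenBSD or Linux where the SOCK_NONBLOCK and SOCK_CLOEXEC type flags are available.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a socketpair whose descriptors are
 * non-blocking and close-on-exec from the start, so there is no
 * window in which they exist without those flags.
 */
static int
ipc_pair(int sv[2])
{
	return (socketpair(AF_UNIX,
	    SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0, sv));
}
```

For accepted connections accept4() takes the same two flags as its last argument, which gives the equivalent behaviour on the listening side.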


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
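
Clearing the pollfd array before every poll() call might look like the sketch below. The fixed NFDS of 2 and the helper name poll_once() are assumptions for the example; bgpd's real loop sets up more descriptors and different event masks.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define NFDS	2

/*
 * Hypothetical helper: rebuild the pollfd array from scratch and
 * poll once. Zeroing first guarantees that no revents from an
 * earlier round survive if poll() fails (e.g. with EINTR) and does
 * not write the array back.
 */
static int
poll_once(const int fds[NFDS], struct pollfd pfd[NFDS], int timeout)
{
	int i;

	memset(pfd, 0, sizeof(*pfd) * NFDS);
	for (i = 0; i < NFDS; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return (poll(pfd, NFDS, timeout));
}
```

The caller should still look at revents only when the return value is greater than zero.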


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected or all static routes, respectively. The list is
automatically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
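
A minimal sketch of the fixed ordering, assuming a flag set from a signal handler. The names reconfig_pending, on_sighup() and handle_pending() are hypothetical and do not come from bgpd.c; the counter merely stands in for the real work.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* hypothetical flag */
int reloads_done;			/* counts handled requests */

/* Signal handler: only records that work is pending. */
void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/*
 * Clear the flag first, then do the work. A signal arriving while the
 * work runs sets the flag again and is picked up by the next call
 * instead of being wiped out by a late "flag = 0".
 * Returns 1 if work was done, 0 otherwise.
 */
int
handle_pending(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;
	reloads_done++;		/* stands in for the real work */
	return 1;
}
```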


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
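
A minimal sketch of the check described here, under the assumption that the parent knows the child's pid. The helper name child_is_gone() is hypothetical; it only shows that the waitpid() status decides whether the child really exited or was terminated.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: inspect a child after SIGCHLD.
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if it is still running (or merely changed state otherwise),
 * -1 if waitpid() itself failed.
 */
int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;	/* waitpid() returns a pid_t, not an int */

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* no state change to report */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;
	return 0;
}
```

A caller would log and shut down only when this returns 1, since SIGCHLD alone does not mean the child died.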


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receival in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
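
The read-once, drain-in-a-loop pattern can be sketched as follows. This is not the imsg API: struct rbuf, msg_get() and msg_drain() are hypothetical, and the one-byte length framing is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE 256

/* Hypothetical read buffer; not the real imsg buffer struct. */
struct rbuf {
	uint8_t	data[RBUF_SIZE];
	size_t	len;		/* bytes currently buffered */
};

/*
 * Extract one message framed as [1-byte payload length][payload].
 * Returns 1 if a complete message was copied to out, 0 if more
 * bytes are needed, -1 if out is too small. Never reads from a fd.
 */
int
msg_get(struct rbuf *rb, uint8_t *out, size_t outsz, size_t *outlen)
{
	size_t plen;

	if (rb->len < 1)
		return 0;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return 0;	/* incomplete, wait for the next read */
	if (plen > outsz)
		return -1;
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	/* shift the remaining bytes to the front of the buffer */
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return 1;
}

/* Drain every complete message; returns how many were processed. */
int
msg_drain(struct rbuf *rb)
{
	uint8_t msg[RBUF_SIZE];
	size_t len;
	int n = 0;

	while (msg_get(rb, msg, sizeof(msg), &len) == 1)
		n++;		/* a real caller would dispatch msg here */
	return n;
}
```

A caller would fill the buffer once per poll event and then call msg_drain(), so that several messages delivered by a single read are all handled before the next poll.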


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.220 19-Jul-2019 claudio

When allocating socketpair() increase their send and receive buffers to
4 times the read size. This helps to increase the efficiency of poll()
since now most of the time the read and write call can operate on full
buffers.
OK benno@ phessler@
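
A minimal sketch of growing socket buffers as described above. The helper name set_sock_bufs() and its read_size parameter are hypothetical; the actual read size used by bgpd is not stated in this log.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request send and receive buffers of four
 * times read_size on fd. The kernel may clamp or adjust the value.
 * Returns 0 on success, -1 if either setsockopt() call fails.
 */
int
set_sock_bufs(int fd, int read_size)
{
	int sz = 4 * read_size;

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &sz, sizeof(sz)) == -1)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &sz, sizeof(sz)) == -1)
		return -1;
	return 0;
}
```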


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
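
A small sketch of why the 'obj, sizeof(*obj)' form is preferred. struct peer_conf, send_obj() and send_peer() are hypothetical stand-ins; send_obj() only copies bytes and is not an imsg call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object type used only for this illustration. */
struct peer_conf {
	unsigned int	id;
	char		descr[32];
};

/* Stand-in for a compose call: copies len bytes of obj into dst. */
size_t
send_obj(void *dst, const void *obj, size_t len)
{
	memcpy(dst, obj, len);
	return len;
}

size_t
send_peer(void *dst, const struct peer_conf *p)
{
	/*
	 * sizeof(*p) follows the pointee type automatically, so the
	 * length stays right even if the type of p is changed later.
	 */
	return send_obj(dst, p, sizeof(*p));
}
```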


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps CLOEXEC flag then it will be closed unexpectedly by
exec().

ok tedu florian
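
A minimal sketch of the guard described here, assuming the goal is to hand a descriptor to an exec()ed child at a fixed number. The helper name place_fd() is hypothetical and this is not necessarily how bgpd.c implements it.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make fd available as target for a later exec().
 * dup2(fd, fd) is a no-op that leaves FD_CLOEXEC set, so that case
 * clears the flag by hand instead. Returns 0 on success, -1 on error.
 */
int
place_fd(int fd, int target)
{
	int flags;

	if (fd == target) {
		if ((flags = fcntl(fd, F_GETFD)) == -1)
			return -1;
		if (fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return 0;
	}
	/* the duplicate created by dup2() does not carry FD_CLOEXEC */
	return dup2(fd, target) == -1 ? -1 : 0;
}
```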


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
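
A minimal sketch of a bsearch-based membership test over a sorted array of AS numbers. The function names as_set_prepare() and as_set_match() are hypothetical and the flat uint32_t array is an assumption; the real as-set layout is not described in this log.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once so that later lookups can use bsearch(). */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup costs O(log n) comparisons, which is the advantage over walking a long list of AS numbers in a filter rule.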


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
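
The EOF-based shutdown can be sketched from the child's point of view as follows. The helper name peer_closed() is hypothetical; it only shows that a closed peer end is visible as read() returning 0, with no kill(2) involved.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical child-side check: returns 1 if the peer closed its
 * end (read() reports EOF), 0 if data was read, -1 on error.
 */
int
peer_closed(int fd)
{
	char buf[64];
	ssize_t n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}
```

An event loop that sees this return 1 can leave the loop and exit on its own.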


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
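
A minimal sketch of the pattern this commit describes, assuming a
hypothetical helper wait_readable() that is not taken from bgpd: the pollfd
entry is zeroed before every poll() call, so a failing poll() (for example
with EINTR) cannot leave stale revents behind for the caller to misread.

```c
#include <poll.h>
#include <string.h>

/*
 * Hypothetical helper, not bgpd code: poll a single descriptor for
 * readability. Returns 1 if readable, 0 on timeout, -1 on error.
 */
int
wait_readable(struct pollfd *pfd, int fd, int timeout_ms)
{
	int nfds;

	/* Clear the entry first so no revents survive from an earlier call. */
	memset(pfd, 0, sizeof(*pfd));
	pfd->fd = fd;
	pfd->events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return (-1);	/* e.g. EINTR: revents is still 0 here */
	if (nfds == 0)
		return (0);
	return ((pfd->revents & POLLIN) ? 1 : 0);
}
```

The same idea extends to an array of entries: clear the whole array, fill in
fd and events, then call poll().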


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
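
The ordering issue can be illustrated with a short sketch. The names
reconfig_pending, reload_count and handle_pending() are assumptions made for
this example and are not bgpd's identifiers; the point is only that the flag
is cleared before the work starts.

```c
#include <signal.h>

/* Set asynchronously by the signal handler, consumed by the main loop. */
volatile sig_atomic_t reconfig_pending;
int reload_count;	/* hypothetical stand-in for the real work */

void
sighdlr(int sig)
{
	if (sig == SIGHUP)
		reconfig_pending = 1;
}

/*
 * Clear the flag before doing the work. A signal arriving while the work
 * runs sets the flag again and is handled on the next call. Clearing the
 * flag afterwards would silently discard that signal.
 */
int
handle_pending(void)
{
	if (!reconfig_pending)
		return (0);
	reconfig_pending = 0;
	reload_count++;		/* stand-in for the real reload */
	return (1);
}
```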


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
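
As a hedged illustration of the check described above, the following sketch
reaps a child with waitpid() and only reports it as gone when the status
says it exited or was killed by a signal. The helper name child_is_gone() is
hypothetical and the code is not the bgpd implementation.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: reap one child without blocking and report whether
 * it is really gone. Returns 1 if the child exited or was killed by a
 * signal, 0 if there is no such state change to report, and -1 on error.
 */
int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* child is still running */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return (1);
	return (0);
}
```

A caller would typically invoke this from the main loop after the SIGCHLD
handler has set a flag, then log and shut down when it returns 1.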


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
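
The split between reading and extracting can be sketched as follows. This is
an illustration under stated assumptions, not the imsg API: struct rbuf,
rbuf_get() and rbuf_drain() are hypothetical names, and the framing (a
16-bit total length in host byte order followed by the payload) is invented
for the example.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	sizeof(uint16_t)

/* Hypothetical read buffer; bgpd's real imsg structures differ. */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		used;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer without reading from any
 * descriptor. Returns the payload length, 0 if no complete message is
 * buffered yet, or -1 on a malformed header or too small output buffer.
 */
ssize_t
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	uint16_t len;

	if (rb->used < HDR_SIZE)
		return (0);
	memcpy(&len, rb->data, HDR_SIZE);
	if (len <= HDR_SIZE || len > RBUF_SIZE || len - HDR_SIZE > outsz)
		return (-1);
	if (rb->used < len)
		return (0);	/* partial message, wait for more data */
	memcpy(out, rb->data + HDR_SIZE, len - HDR_SIZE);
	/* Shift the remaining bytes (possibly more messages) to the front. */
	memmove(rb->data, rb->data + len, rb->used - len);
	rb->used -= len;
	return (len - HDR_SIZE);
}

/* Process every buffered message after a single read, not just the first. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char payload[RBUF_SIZE];
	ssize_t n;
	int count = 0;

	while ((n = rbuf_get(rb, payload, sizeof(payload))) > 0)
		count++;	/* a real caller would dispatch the message here */
	return (n == -1 ? -1 : count);
}
```

Calling the extraction function in a loop is what keeps a second or third
message from sitting in the buffer until the next poll event.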


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.219 29-May-2019 claudio

Rework pfkey handling a bit. The old remove then add way of inserting md5sig
hit a race frequently where a session ended up with no key/SPI in the kernel.
Since there is no way to do atomic updates of SADB_X_SATYPE_TCPSIGNATURE
the code is adding a new one then removing the old one.
Also make sure keys are correctly cleared when peers are deconfigured.
May not be perfect but a lot better than what was there before.
Tested by and OK sthen@


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process as part of this. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps the CLOEXEC flag, so it will be closed unexpectedly by
exec().

ok tedu florian
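
A minimal sketch of the idea, assuming a hypothetical helper name dup_fd_for_exec() that is not part of bgpd: dup2() clears close-on-exec on the new descriptor, but when both arguments are equal it does nothing, so one way to handle that case is to clear the flag by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make oldd available as newd so that it survives
 * exec. Returns 0 on success, -1 on error.
 */
int
dup_fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		/* dup2() would be a no-op: drop FD_CLOEXEC explicitly. */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	/* The descriptor created by dup2() never has FD_CLOEXEC set. */
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```

Whether bgpd clears the flag or avoids the situation in some other way is not stated in the commit message; the sketch only shows the pitfall.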


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.
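
The bug class described here (passing an element count where a byte length is expected) can be sketched as follows. The helper name and the uint32_t element type are assumptions made for illustration, not the actual bgpd code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte length of a table holding nelem entries,
 * as needed for the length argument of a message payload.
 */
size_t
set_table_bytes(const uint32_t *set, size_t nelem)
{
	/* Length is element count times element size, not the count. */
	return (nelem * sizeof(*set));
}
```

Using sizeof(*set) rather than a spelled-out type keeps the computation correct if the element type changes later.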


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@
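
The per-node check described above can be sketched for a single node. This is not the trie itself and not bgpd's code: the struct, the function names and the IPv4-only layout are assumptions chosen to show how a bitmask of accepted prefix lengths is attached to an addr/plen pair.

```c
#include <stdint.h>

/* Hypothetical IPv4-only node: addr/plen plus accepted prefix lengths. */
struct pset_node {
	uint32_t addr;		/* host byte order */
	uint8_t	 plen;		/* 0..32 */
	uint64_t lenmask;	/* bit n set: prefix length n is accepted */
};

static uint32_t
plen_to_mask(uint8_t plen)
{
	return (plen == 0 ? 0 : 0xffffffffU << (32 - plen));
}

/* Record "prefixlen min - max" on the node. */
void
pset_allow(struct pset_node *n, unsigned int min, unsigned int max)
{
	unsigned int l;

	for (l = min; l <= max && l <= 32; l++)
		n->lenmask |= (uint64_t)1 << l;
}

/* Returns 1 if addr/plen lies under the node and plen is accepted. */
int
pset_match(const struct pset_node *n, uint32_t addr, uint8_t plen)
{
	uint32_t mask = plen_to_mask(n->plen);

	if (plen < n->plen || plen > 32)
		return (0);
	if ((addr & mask) != (n->addr & mask))
		return (0);
	return ((int)((n->lenmask >> plen) & 1));
}
```

In a trie, a lookup would walk the nodes covering the address and stop at the first one for which such a check succeeds, which is why no longest-match logic is needed.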


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
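
A minimal sketch of a bsearch(3) based membership test over a sorted array of AS numbers. The function names are hypothetical and the layout (a plain uint32_t array) is an assumption; the commit message only states that bsearch is used.

```c
#include <stdint.h>
#include <stdlib.h>

/* Compare two uint32_t AS numbers without subtraction overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once after the set has been loaded; bsearch needs sorted input. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup takes O(log n) comparisons, compared with a linear walk over a list of AS numbers in a filter rule.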


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
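
How a child can notice the teardown is sketched below. The helper name peer_closed() is hypothetical; the sketch relies only on the fact that read(2) on a stream socket returns 0 once the other end has been closed.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the peer end of an IPC socket has
 * been closed (EOF), 0 if data was read, -1 on error.
 */
int
peer_closed(int fd)
{
	char	buf[64];
	ssize_t	n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return (-1);
	return (n == 0);	/* 0 bytes on a stream socket means EOF */
}
```

An event loop that sees EOF on its IPC socket can simply leave the loop and exit, so the parent does not need kill(2) or the pid of the child.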


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
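
The fix can be sketched as follows, using memset() in place of bzero(). The helper name, the NFDS constant and the fixed POLLIN interest are assumptions for illustration; bgpd's real loop sets up more descriptors.

```c
#include <poll.h>
#include <string.h>

#define NFDS 2	/* hypothetical number of descriptors */

/*
 * Poll fds[0..NFDS-1] for input. Returns poll()'s result; pfd[] should
 * only be inspected when the result is greater than 0.
 */
int
poll_inputs(const int fds[NFDS], struct pollfd pfd[NFDS], int timeout)
{
	int i;

	/* Start from a clean array so no revents survive from the last pass. */
	memset(pfd, 0, sizeof(*pfd) * NFDS);
	for (i = 0; i < NFDS; i++) {
		pfd[i].fd = fds[i];
		pfd[i].events = POLLIN;
	}
	return (poll(pfd, NFDS, timeout));
}
```

With the array cleared on every pass, an error return such as EINTR cannot leave stale revents behind that a later check might mistake for fresh input.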


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solve a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
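
The ordering described above can be sketched like this. All names (reconfig_pending, sighdlr, do_reload, handle_pending) are hypothetical stand-ins, and reloads_done exists only to make the effect observable.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set by the signal handler */
int reloads_done;			/* counts calls, for illustration */

/* Signal handler: only records that work is pending. */
void
sighdlr(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work. */
void
do_reload(void)
{
	reloads_done++;
}

/* Called from the main loop. */
void
handle_pending(void)
{
	if (reconfig_pending) {
		/*
		 * Clear first: a signal arriving while do_reload() runs
		 * sets the flag again and is handled on the next pass.
		 */
		reconfig_pending = 0;
		do_reload();
	}
}
```

If the flag were cleared after do_reload() instead, a signal delivered during the reload would be wiped out by the assignment and the second request would be lost.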


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
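
The point above, that SIGCHLD alone does not mean a child died, can be
illustrated with a minimal sketch. check_child() is a hypothetical helper
written for this note, not the function in bgpd.c; it only assumes the
standard waitpid(2) status macros.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the bgpd.c code): called after SIGCHLD was seen.
 * Returns 1 if the child exited or was killed by a signal, 0 if its state
 * did not change in a fatal way (still running, stopped, continued),
 * and -1 if waitpid() itself failed.
 */
int
check_child(pid_t pid, const char *pname)
{
	int	status;
	pid_t	r;	/* waitpid() returns a pid_t, not an int */

	do {
		r = waitpid(pid, &status, WNOHANG);
	} while (r == -1 && errno == EINTR);

	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);		/* child is still running */

	if (WIFEXITED(status)) {
		fprintf(stderr, "lost child: %s exited\n", pname);
		return (1);
	}
	if (WIFSIGNALED(status)) {
		fprintf(stderr, "lost child: %s terminated; signal %d\n",
		    pname, WTERMSIG(status));
		return (1);
	}
	return (0);			/* e.g. stopped or continued */
}
```

A caller would log and start its own shutdown only when this returns 1.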


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
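
The split described above (one function that reads into a buffer, one that
only parses the buffer, and a caller that loops) can be sketched as follows.
This is an illustration under stated assumptions, not the imsg code:
struct rbuf, msg_read(), msg_get() and the 2-byte length header are all
hypothetical stand-ins.

```c
#include <sys/types.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RBUF_SIZE	4096

/* Hypothetical read buffer; not the structure used by bgpd. */
struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/*
 * Read whatever is available into the buffer. Returns the byte count,
 * so 0 means the peer closed the connection and -1 is an error.
 * Assumes the caller drained complete messages first, so there is room.
 */
ssize_t
msg_read(int fd, struct rbuf *rb)
{
	ssize_t	n;

	n = read(fd, rb->buf + rb->len, sizeof(rb->buf) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return (n);
}

/*
 * Extract exactly one message from the buffer without touching the fd.
 * Assumed framing: a uint16_t total length (header included, host order)
 * followed by the payload. payload must hold RBUF_SIZE bytes.
 * Returns 1 if a message was copied out, 0 if none is complete yet,
 * -1 if the header is malformed.
 */
int
msg_get(struct rbuf *rb, unsigned char *payload, size_t *plen)
{
	uint16_t	len;

	if (rb->len < sizeof(len))
		return (0);
	memcpy(&len, rb->buf, sizeof(len));
	if (len < sizeof(len) || len > RBUF_SIZE)
		return (-1);
	if (rb->len < len)
		return (0);

	*plen = len - sizeof(len);
	memcpy(payload, rb->buf + sizeof(len), *plen);
	/* shift the remaining bytes (possibly more messages) to the front */
	memmove(rb->buf, rb->buf + len, rb->len - len);
	rb->len -= len;
	return (1);
}
```

On a poll event the caller would call msg_read() once and then call
msg_get() until it stops returning 1, so that no buffered message waits
for the next poll event.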


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple
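
The rule above amounts to a small decision table. The sketch below only
illustrates that table; the enum and function name are hypothetical, and the
real code calls kroute_fib_couple/kroute_fib_decouple directly.

```c
/* Hypothetical names, used only to illustrate the decision. */
enum fib_action {
	FIB_NONE,	/* statement unchanged: leave the FIB alone */
	FIB_COUPLE,	/* "no fib-update" was removed */
	FIB_DECOUPLE	/* "no fib-update" was added */
};

/*
 * old_nofib/new_nofib: non-zero if "no fib-update" is present in the
 * previous and in the freshly parsed config, respectively.
 */
enum fib_action
fib_reconf_action(int old_nofib, int new_nofib)
{
	if (old_nofib && !new_nofib)
		return (FIB_COUPLE);
	if (!old_nofib && new_nofib)
		return (FIB_DECOUPLE);
	return (FIB_NONE);
}
```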


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.218 27-May-2019 claudio

Switch the peer TAILQ to a RB tree indexed by the peer id. This way
getpeerbyid() gets a lot quicker at finding the peer when many peers
are configured. In my test case the difference is around 20% runtime.
OK denis@


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
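
A short sketch of why the 'obj, sizeof(*obj)' form is preferred: the size
follows the pointer's type automatically if that type ever changes. The
struct and both functions below are hypothetical; send_obj() merely stands
in for an imsg compose call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object type, for illustration only. */
struct peer_config {
	unsigned int	id;
	char		descr[32];
};

/* Stand-in for a compose call: copies len bytes, returns bytes queued. */
size_t
send_obj(void *dst, size_t dstlen, const void *data, size_t len)
{
	if (len > dstlen)
		return (0);
	memcpy(dst, data, len);
	return (len);
}

size_t
send_peer(void *dst, size_t dstlen, const struct peer_config *p)
{
	/*
	 * sizeof(*p) is derived from the pointer itself; with
	 * sizeof(struct peer_config) the type name would have to be
	 * kept in sync by hand.
	 */
	return (send_obj(dst, dstlen, p, sizeof(*p)));
}
```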


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state changes to IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since now
both parent and session engine are merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps the CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
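
One way to handle the oldd == newd case is sketched below. This is an
assumption about the general technique, not a copy of the committed fix;
fd_move_for_exec() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make oldd available as newd in a way that
 * survives exec(). Returns 0 on success, -1 on error.
 */
int
fd_move_for_exec(int oldd, int newd)
{
	int	flags;

	if (oldd == newd) {
		/*
		 * dup2() with equal descriptors does nothing, so a
		 * close-on-exec flag would stay set. Clear it by hand.
		 */
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}

	/* a descriptor created by dup2() has close-on-exec cleared */
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```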


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
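
The lookup idea can be sketched with the standard qsort(3) and bsearch(3)
calls on a plain array of AS numbers. The function names are hypothetical and
the layout is an assumption; the actual table in bgpd may be organised
differently.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback shared by qsort() and bsearch(). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	/* avoids the overflow a plain subtraction could cause */
	return ((x > y) - (x < y));
}

/* Sort the set once after loading; bsearch() needs sorted input. */
void
as_set_prep(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is in the (sorted) set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Compared with walking a long list of AS numbers per filter rule, each lookup
needs only a logarithmic number of comparisons.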


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
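
A minimal sketch of such a pipe teardown, assuming a single child and a
socketpair(2) as the IPC channel. The function names are hypothetical and the
child's "event loop" is reduced to a blocking read.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/* Stand-in for a child's event loop: run until the parent's end closes. */
static void
child_loop(int fd)
{
	char	buf[64];
	ssize_t	n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == 0)
			_exit(0);	/* EOF: parent closed the socket */
		if (n == -1 && errno != EINTR)
			_exit(1);
	}
}

/* Returns the child's exit status, or -1 on error. */
int
spawn_and_teardown(void)
{
	int	sv[2], status;
	pid_t	pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);

	switch (pid = fork()) {
	case -1:
		close(sv[0]);
		close(sv[1]);
		return (-1);
	case 0:
		/* the child must drop the parent's end or EOF never arrives */
		close(sv[0]);
		child_loop(sv[1]);
		_exit(1);		/* not reached */
	}
	close(sv[1]);

	/* shutdown: closing the socket is the only signal the child needs */
	close(sv[0]);

	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return (-1);
	}
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Since no kill(2) is involved, the parent never acts on a pid that might
already have been reused.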


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
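
A small sketch of what "detect broken pipes without the signal" can look
like: with SIGPIPE ignored, write(2) on a pipe without a reader fails with
EPIPE instead of killing the process. The helper name is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd turned out to be a broken pipe,
 * 0 if the write went through, -1 on any other error.
 */
int
write_detects_broken_pipe(int fd)
{
	/* without this, the default SIGPIPE action terminates the process */
	signal(SIGPIPE, SIG_IGN);

	if (write(fd, "x", 1) == -1)
		return (errno == EPIPE ? 1 : -1);
	return (0);
}
```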


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
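
The pattern described above can be sketched as follows. This is a minimal
illustration, not bgpd's actual poll loop: PFD_COUNT, poll_once() and the
two descriptor arguments are hypothetical names chosen for the example.
The point is only that the array is cleared before every poll() call, so
a failed call cannot leave old revents behind.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical number of descriptors */

/*
 * Rebuild the pollfd array from scratch on every pass so that stale
 * revents from an earlier call can never be mistaken for fresh events
 * when poll() fails (for example with EINTR).
 */
static int
poll_once(int fd_a, int fd_b, struct pollfd pfd[PFD_COUNT], int timeout_ms)
{
	memset(pfd, 0, sizeof(pfd[0]) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

The caller should still check the return value and skip the revents
handling entirely on error or timeout.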


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
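
A minimal sketch of the corrected ordering, assuming a flag that is set
from a signal handler. The names reconfig_flag, on_sighup(), do_work()
and handle_pending() are hypothetical and only stand in for whatever the
daemon really uses.

```c
#include <signal.h>

/* Hypothetical flag set from a signal handler; not bgpd's actual variable. */
static volatile sig_atomic_t reconfig_flag;
static int work_runs;

static void
on_sighup(int sig)
{
	(void)sig;
	reconfig_flag = 1;
}

/* Stand-in for the real work, e.g. rereading a config file. */
static void
do_work(void)
{
	work_runs++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_work() is running sets the flag again and is handled on the next
 * pass instead of being wiped out by a late "flag = 0".
 */
static void
handle_pending(void)
{
	if (reconfig_flag) {
		reconfig_flag = 0;
		do_work();
	}
}
```

With the old ordering, a signal delivered during do_work() would be lost
because the flag was reset after the work had finished.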


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
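
The read-once, drain-in-a-loop split described above can be sketched
like this. It deliberately does not use the real imsg structures or
functions: struct rbuf, msg_get() and msg_drain() are hypothetical, and
the one-byte length framing is an assumption made only to keep the
example short.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read buffer; the real imsg structures differ. */
struct rbuf {
	uint8_t	data[256];
	size_t	len;		/* bytes currently buffered */
};

/*
 * Toy framing, assumed for this sketch only: one length byte followed
 * by that many payload bytes. Returns 1 and copies out one message if
 * a complete one is buffered, 0 if more input is needed. This function
 * never reads from a descriptor; it works on the buffer alone.
 */
static int
msg_get(struct rbuf *rb, uint8_t *out, size_t *outlen)
{
	size_t need;

	if (rb->len < 1)
		return 0;
	need = (size_t)rb->data[0] + 1;
	if (rb->len < need)
		return 0;
	*outlen = need - 1;
	memcpy(out, rb->data + 1, *outlen);
	memmove(rb->data, rb->data + need, rb->len - need);
	rb->len -= need;
	return 1;
}

/* After one read filled the buffer, process every complete message. */
static int
msg_drain(struct rbuf *rb)
{
	uint8_t	msg[255];
	size_t	n;
	int	count = 0;

	while (msg_get(rb, msg, &n))
		count++;
	return count;
}
```

Calling the getter only once per poll event would leave the second and
later messages sitting in the buffer, which is the queue build-up the
commit message describes.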


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.217 08-May-2019 claudio

when passing objects to imsg use the form 'obj, sizeof(*obj)' instead of
'obj, sizeof(struct object)'
OK benno@
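
A small sketch of the idiom, under stated assumptions: struct peer_info,
send_blob() and send_peer() are hypothetical and merely stand in for a
real object and a real message-composing call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical object type, for illustration only. */
struct peer_info {
	int	id;
	char	descr[32];
};

/* Hypothetical stand-in for a call that copies `len` bytes into a message. */
static size_t
send_blob(void *dst, const void *data, size_t len)
{
	memcpy(dst, data, len);
	return len;
}

static size_t
send_peer(void *dst, const struct peer_info *p)
{
	/*
	 * sizeof(*p) is derived from the pointer itself, so the length
	 * stays correct if the declared type of p is ever changed.
	 */
	return send_blob(dst, p, sizeof(*p));
}
```

Writing sizeof(struct peer_info) instead would compile just as well, but
the size and the pointer could silently drift apart after a refactor.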


# 1.216 08-May-2019 claudio

Rework the TCP md5sig and IKE handling. Move the pfkey socket to the parent
process in this process. The refreshing of the keys is done whenever the
session state is changes to state IDLE or ACTIVE. This should behave better
when reloading configs with auth changes.
OK benno@


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and will be closed unexpectedly by
exec().

ok tedu florian
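
One way to handle the equal-descriptor case is sketched below. This is
an illustration of the idea, not necessarily the exact change that was
committed; fd_for_exec() is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Make `oldd` available as `newd` for a program about to be exec'd.
 * dup2() with equal descriptors does nothing and leaves FD_CLOEXEC
 * set, so that case clears the flag explicitly instead.
 * Returns newd on success, -1 on error.
 */
static int
fd_for_exec(int oldd, int newd)
{
	int flags;

	if (oldd == newd) {
		flags = fcntl(oldd, F_GETFD);
		if (flags == -1)
			return -1;
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return -1;
		return newd;
	}
	/* A descriptor created by dup2() never has FD_CLOEXEC set. */
	return dup2(oldd, newd);
}
```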


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
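
A minimal sketch of a bsearch(3) membership test over a sorted array of
AS numbers. The helper names as_cmp() and as_set_match() are
hypothetical, and the flat uint32_t array is an assumption; the actual
table layout in bgpd may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/*
 * Returns 1 if `as` is contained in `set`, 0 otherwise.
 * `set` must hold `n` AS numbers sorted in ascending order.
 */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(set[0]), as_cmp) != NULL;
}
```

The lookup cost grows with the logarithm of the set size, whereas
walking a list of AS numbers in a filter rule grows linearly.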


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
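
The EOF-on-close behaviour that the pipe teardown relies on can be
sketched as follows. Both ends live in one process here purely to keep
the example self-contained; peer_closed() and teardown_demo() are
hypothetical names, and a real daemon would have the two ends in
different processes.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer end of `fd` has been closed (read() sees EOF),
 * 0 if data was read, -1 on error.
 */
static int
peer_closed(int fd)
{
	char	buf[64];
	ssize_t	n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}

/* Create a socket pair, close one end, and check the other for EOF. */
static int
teardown_demo(void)
{
	int sv[2], r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	close(sv[0]);			/* the "parent" drops its end */
	r = peer_closed(sv[1]);		/* the "child" side sees EOF */
	close(sv[1]);
	return r;
}
```

A child that treats EOF on its IPC socket as a request to leave the
event loop needs no signal from the parent, which is why kill(2) and
the "proc" pledge become unnecessary.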


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and re-add the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding or modifying restricted control sockets at
runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
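
A minimal sketch of the pattern from 1.127 and 1.128, assuming a single
descriptor; poll_once() is a hypothetical helper and not a function in
bgpd.c. The array is zeroed before every call so no revents from an
earlier iteration can survive, and revents is only inspected when poll(2)
reports at least one ready descriptor.

```c
#include <poll.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if fd is readable within timeout_ms,
 * 0 on timeout or error.
 */
int
poll_once(int fd, int timeout_ms)
{
	struct pollfd	pfd[1];

	/* start from a clean array: no stale revents from a previous run */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	/* on error (e.g. EINTR) or timeout, do not look at revents at all */
	if (poll(pfd, 1, timeout_ms) <= 0)
		return (0);

	return ((pfd[0].revents & POLLIN) != 0);
}
```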


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
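
The ordering in 1.106 matters because a second signal can arrive while
the work is running: if the flag were cleared afterwards, that second
request would be wiped out. A sketch under assumed names (reconfig_flag,
do_reload() and check_reconfig() are hypothetical, not the identifiers
used in bgpd.c):

```c
#include <signal.h>

volatile sig_atomic_t	reconfig_flag;	/* set only by the handler */
int			reloads_done;	/* stands in for real work */

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_flag = 1;
}

void
do_reload(void)
{
	reloads_done++;
}

/* One pass of the main loop's flag check. Returns 1 if work was done. */
int
check_reconfig(void)
{
	if (reconfig_flag) {
		reconfig_flag = 0;	/* clear first ... */
		do_reload();		/* ... then do the work */
		return (1);
	}
	return (0);
}
```

With this order, a signal delivered during do_reload() sets the flag
again and is picked up on the next pass instead of being lost.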


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
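
A minimal sketch of the check described in 1.56, assuming the caller
already knows the child's pid; child_terminated() is a hypothetical
helper, and bgpd's own handler differs in detail. SIGCHLD is also
delivered for other state changes, so the status from waitpid(2) has to
be examined before concluding that the child is gone.

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: returns 1 if the child exited or was killed by a
 * signal, 0 if there is no such state change to report, -1 on error.
 * Pass WNOHANG in options to avoid blocking in a main loop.
 */
int
child_terminated(pid_t pid, int options)
{
	int	status;
	pid_t	r;		/* waitpid's return is a pid_t */

	r = waitpid(pid, &status, options);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* WNOHANG and nothing changed */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return (1);	/* really gone: log and quit */
	return (0);		/* e.g. stopped; not a termination */
}
```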


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
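
The split described in 1.48 can be illustrated with a toy framing layer.
This is not the imsg API: struct rbuf, rbuf_fill(), rbuf_get() and
rbuf_drain() are hypothetical names, and the wire format (a one-byte
length followed by the payload) is an assumption made for the example.
One function only appends what a single read(2) returned, the other only
parses the buffer, and the caller loops until no complete message is left.

```c
#include <string.h>

#define RBUF_SIZE	256

struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/* Append raw bytes as if returned by one read(2). -1 if they don't fit. */
int
rbuf_fill(struct rbuf *rb, const void *p, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, p, n);
	rb->len += n;
	return (0);
}

/*
 * Extract one message without doing any I/O. Returns 1 and fills out
 * (at least 255 bytes) and outlen, or 0 if no complete message is buffered.
 */
int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t *outlen)
{
	size_t	plen;

	if (rb->len < 1)
		return (0);
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return (0);		/* partial message, wait for more */
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return (1);
}

/* Process every complete message in the buffer; returns how many. */
int
rbuf_drain(struct rbuf *rb)
{
	unsigned char	msg[255];
	size_t		len;
	int		n = 0;

	while (rbuf_get(rb, msg, &len))
		n++;			/* a real caller would dispatch msg here */
	return (n);
}
```

Calling rbuf_get() only once per poll event would leave later messages
sitting in the buffer, which is the queue build-up the commit describes.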


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


Revision tags: OPENBSD_6_5_BASE
# 1.215 31-Mar-2019 claudio

Move the struct peer into bgpd_config and switch it to a TAILQ instead of
the hand-rolled list. This changes the way peers are reloaded since
both parent and session engine are now merging the lists.
OK denis@


# 1.214 31-Mar-2019 yasuoka

Avoid calling dup2(oldd, newd) when oldd == newd. In that case the
descriptor keeps its CLOEXEC flag and is then closed unexpectedly by
exec().

ok tedu florian
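
The problem described above can be sketched as follows. This is a minimal illustration under the assumption that the descriptor must survive exec(); move_fd_for_exec() is a hypothetical name, not a function from bgpd. dup2(2) clears the close-on-exec flag on the new descriptor only when it really duplicates, so the equal case has to clear the flag by hand.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: make oldd available as newd so that it survives
 * exec(). If oldd == newd, dup2() would do nothing and FD_CLOEXEC would
 * stay set, so the flag is cleared explicitly instead.
 * Returns 0 on success, -1 on error.
 */
int
move_fd_for_exec(int oldd, int newd)
{
	int	flags;

	if (oldd == newd) {
		if ((flags = fcntl(oldd, F_GETFD)) == -1)
			return (-1);
		if (fcntl(oldd, F_SETFD, flags & ~FD_CLOEXEC) == -1)
			return (-1);
		return (0);
	}
	if (dup2(oldd, newd) == -1)
		return (-1);
	return (0);
}
```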


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structures are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
    match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
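
How the three ovs states relate to a roa-set can be sketched as below. This is an illustration of the usual origin validation rule (a route is valid if a covering entry matches the origin AS and the prefix length does not exceed maxlen, invalid if it is covered but nothing matches, not-found if nothing covers it). It is not the bgpd implementation: the names are hypothetical, only IPv4 is handled, and a linear scan stands in for the real lookup structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; bgpd's own identifiers may differ. */
enum ovs { OVS_NOTFOUND, OVS_VALID, OVS_INVALID };

struct roa {
	uint32_t	addr;	/* IPv4 prefix, host byte order */
	uint8_t		plen;	/* prefix length of the entry */
	uint8_t		maxlen;	/* longest announcement allowed */
	uint32_t	asn;	/* source-as */
};

/* Does the roa entry cover the announced prefix addr/plen? */
static int
roa_covers(const struct roa *r, uint32_t addr, uint8_t plen)
{
	uint32_t	mask;

	if (r->plen > plen)
		return (0);
	mask = r->plen == 0 ? 0 : 0xffffffffU << (32 - r->plen);
	return ((addr & mask) == (r->addr & mask));
}

/* Classify addr/plen announced by asn against the set. */
enum ovs
roa_check(const struct roa *set, size_t n, uint32_t addr, uint8_t plen,
    uint32_t asn)
{
	enum ovs	res = OVS_NOTFOUND;
	size_t		i;

	for (i = 0; i < n; i++) {
		if (!roa_covers(&set[i], addr, plen))
			continue;
		if (set[i].asn == asn && plen <= set[i].maxlen)
			return (OVS_VALID);
		res = OVS_INVALID;	/* covered, but no match so far */
	}
	return (res);
}
```

With the two entries from the example config, 193.0.0.0/24 from AS 3333 is valid, while 193.0.0.0/25 from the same AS is invalid because it is longer than maxlen 24.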


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that is the IMSG_RECONF_DONE message sent, activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.
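
The class of bug fixed here, passing an element count where a byte count is expected, can be illustrated with a small sketch. array_bytes() is a hypothetical helper and the element type in the usage note is an assumption; the actual bgpd code is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of an array of nelems elements of
 * elemsize bytes each. Returns 0 if the multiplication would overflow.
 */
size_t
array_bytes(size_t nelems, size_t elemsize)
{
	if (elemsize != 0 && nelems > SIZE_MAX / elemsize)
		return (0);
	return (nelems * elemsize);
}
```

For a table of three 32-bit AS numbers the message payload is array_bytes(3, sizeof(uint32_t)) bytes, not 3.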


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
    roa-set "test2" {
        1.2.3.0/24 source-as 1,
        1.2.8.0/22 maxlen 24 source-as 3
    }
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
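
The lookup idea can be sketched with the standard qsort(3) and bsearch(3) calls. The function names and the flat uint32_t array are assumptions made for illustration, not the data structures used in bgpd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Compare two AS numbers for qsort(3) and bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t	x = *(const uint32_t *)a;
	uint32_t	y = *(const uint32_t *)b;

	if (x < y)
		return (-1);
	if (x > y)
		return (1);
	return (0);
}

/* Sort the set once after loading; bsearch(3) needs sorted input. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Return 1 if as is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup then costs a logarithmic number of comparisons instead of a walk over the whole list.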


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
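
The child side of such a "pipe teardown" can be sketched like this. ipc_step() is a hypothetical function and the fixed buffer is an assumption; the real daemons read through the imsg layer. The point is only that read(2) returning 0 signals that the parent closed its end.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical child-side step of the event loop on the IPC socket.
 * Returns 1 if data arrived, 0 if the peer closed its end (EOF, time
 * to leave the loop and exit), and -1 on error.
 */
int
ipc_step(int fd)
{
	char	buf[512];
	ssize_t	n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return (-1);
	if (n == 0)
		return (0);	/* parent closed the socket */
	return (1);
}
```

The parent only has to close(2) its descriptors; no kill(2) and therefore no PID is involved.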


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
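
A minimal sketch of the replacement pattern: instead of creating a socket and then adjusting it with separate fcntl(2) calls, both properties are requested atomically at creation time. open_nb_socket() is a hypothetical name and AF_UNIX is chosen only to keep the example self-contained; accept4() takes the same two flags in its last argument.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/*
 * Hypothetical helper: create a socket that is non-blocking and
 * close-on-exec from the start, with no follow-up fcntl(2) calls.
 * Returns the descriptor or -1 on error.
 */
int
open_nb_socket(void)
{
	return (socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC,
	    0));
}
```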


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
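
The fix can be sketched as follows. setup_pfd() and PFD_COUNT are hypothetical, and memset(3) is used here in place of the bzero(3) call named in the commit; the effect is the same. Because the array is cleared before every poll(2) call, a failed call that leaves the array untouched can never expose revents from an earlier iteration.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT	2	/* illustrative size, not bgpd's real value */

/*
 * Hypothetical helper: rebuild the pollfd array from scratch on every
 * loop iteration. Zeroing first guarantees that stale revents cannot be
 * seen again if poll() fails (e.g. with EINTR) without updating them.
 */
void
setup_pfd(struct pollfd *pfd, int fd_a, int fd_b)
{
	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;
}
```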


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear the flag,
then call dowork(). ok henning@
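
The corrected ordering can be sketched like this. All names are hypothetical stand-ins; dowork() represents whatever the main loop does in response to the signal. With the flag cleared first, a signal that arrives while dowork() is running sets the flag again and is handled on the next pass instead of being wiped out afterwards.

```c
#include <signal.h>

volatile sig_atomic_t	reconfig_pending;	/* set by the handler */
int			reloads_done;		/* counts dowork() calls */

/* Signal handler: only sets a flag, the main loop does the work. */
void
sighdlr(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config. */
static void
dowork(void)
{
	reloads_done++;
}

/*
 * Called from the main loop. Clears the flag BEFORE starting the work.
 * Returns 1 if work was done, 0 otherwise.
 */
int
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		dowork();
		return (1);
	}
	return (0);
}
```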


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
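
The underlying mechanism, passing an open descriptor over a Unix-domain socket with SCM_RIGHTS, can be sketched as below. This is a bare sendmsg(2)/recvmsg(2) illustration with hypothetical function names; in bgpd the descriptor travels inside the imsg/msgbuf framework, whose code is not shown here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Properly aligned buffer for one control message carrying one int. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the Unix socket sock. Returns 0 on success, -1 on error. */
int
send_fd(int sock, int fd)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	struct cmsghdr	*cmsg;
	char		 c = 0;	/* one byte of payload carries the fd */

	memset(&msg, 0, sizeof(msg));
	memset(&cm, 0, sizeof(cm));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive a descriptor from sock. Returns the new fd, or -1 on error. */
int
recv_fd(int sock)
{
	struct msghdr	 msg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	struct cmsghdr	*cmsg;
	char		 c;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	memset(&cm, 0, sizeof(cm));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return (fd);
}
```

A privileged process can thus socket() and bind() a listener and hand the ready descriptor to a process that has already dropped privileges.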


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
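
The SIGCHLD handling described above can be sketched roughly as follows. This is a minimal illustration and not the code from bgpd.c: the names got_sigchld, sigchld_handler and check_children are hypothetical. The handler only sets a flag; the main loop then asks waitpid() what actually happened and treats only an exit or a fatal signal as a reason to quit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>

/* Hypothetical flag: set in the handler, inspected from the main loop. */
volatile sig_atomic_t got_sigchld;

void
sigchld_handler(int sig)
{
	(void)sig;
	got_sigchld = 1;	/* no other work inside the handler */
}

/*
 * Reap children without blocking.
 * Returns 1 if at least one child exited or was killed by a signal
 * (the caller should log and quit), 0 otherwise.
 */
int
check_children(void)
{
	pid_t pid;		/* waitpid() returns a pid_t, not an int */
	int status, dead = 0;

	for (;;) {
		pid = waitpid(-1, &status, WNOHANG);
		if (pid == -1) {
			if (errno == EINTR)
				continue;
			break;	/* ECHILD: there are no children at all */
		}
		if (pid == 0)
			break;	/* children exist, none has terminated */
		assert(pid > 0);
		if (WIFEXITED(status) || WIFSIGNALED(status)) {
			fprintf(stderr, "child %ld terminated\n", (long)pid);
			dead = 1;
		}
	}
	return dead;
}
```

A caller would register the handler with signal(SIGCHLD, sigchld_handler) and call check_children() from its main loop whenever got_sigchld is set. A return value of 0 then means the signal did not indicate a dead child.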


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
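
The split between a read step and a get step can be illustrated with a small self-contained sketch. It does not use the real imsg API: struct rbuf, struct frame, rbuf_append, frame_get and frame_drain are hypothetical names, and the 4-byte header (type, total length, both in host byte order) is an assumption made for the example. The point is that one read may deliver several messages, so the caller has to drain the buffer in a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	4	/* assumed header: 2-byte type, 2-byte total length */

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		len;	/* bytes currently buffered */
};

struct frame {
	uint16_t	type;
	uint16_t	len;	/* total length, header included */
	unsigned char	data[RBUF_SIZE];
};

/* Stand-in for the read(2) step: append n bytes to the buffer. */
int
rbuf_append(struct rbuf *rb, const void *p, size_t n)
{
	assert(rb->len <= sizeof(rb->buf));
	if (n > sizeof(rb->buf) - rb->len)
		return -1;
	memcpy(rb->buf + rb->len, p, n);
	rb->len += n;
	return 0;
}

/*
 * Extract at most one frame; never reads from a descriptor.
 * Returns 1 if a frame was extracted, 0 if more data is needed,
 * -1 if the header is invalid.
 */
int
frame_get(struct rbuf *rb, struct frame *f)
{
	uint16_t hdr[2];

	if (rb->len < HDR_SIZE)
		return 0;
	memcpy(hdr, rb->buf, HDR_SIZE);
	if (hdr[1] < HDR_SIZE || hdr[1] > RBUF_SIZE)
		return -1;
	if (rb->len < hdr[1])
		return 0;
	f->type = hdr[0];
	f->len = hdr[1];
	memcpy(f->data, rb->buf + HDR_SIZE, f->len - HDR_SIZE);
	/* shift the remaining bytes to the front of the buffer */
	memmove(rb->buf, rb->buf + f->len, rb->len - f->len);
	rb->len -= f->len;
	return 1;
}

/* Called once per poll event, after the read step: drain all frames. */
int
frame_drain(struct rbuf *rb)
{
	struct frame f;
	int n, count = 0;

	while ((n = frame_get(rb, &f)) == 1)
		count++;	/* a real caller would dispatch on f.type here */
	return n == -1 ? -1 : count;
}
```

Calling frame_get only once per poll event would leave the second and later frames sitting in the buffer until the next event, which is the queue build-up the commit message describes.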


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple
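
The decision described above reduces to comparing the old and new state of one flag. A minimal sketch, assuming the flag is stored as 0 or 1; the enum and the function name fib_reconf_action are hypothetical and only the two transitions come from the commit message:

```c
#include <assert.h>

enum fib_action {
	FIB_NONE,	/* statement unchanged: nothing to do */
	FIB_COUPLE,	/* statement removed: would call kroute_fib_couple */
	FIB_DECOUPLE	/* statement added: would call kroute_fib_decouple */
};

/*
 * old_no_fib / new_no_fib: 1 if "no fib-update" is present in the
 * previous / freshly parsed configuration, 0 if it is absent.
 */
enum fib_action
fib_reconf_action(int old_no_fib, int new_no_fib)
{
	assert(old_no_fib == 0 || old_no_fib == 1);
	assert(new_no_fib == 0 || new_no_fib == 1);

	if (old_no_fib && !new_no_fib)
		return FIB_COUPLE;
	if (!old_no_fib && new_no_fib)
		return FIB_DECOUPLE;
	return FIB_NONE;
}
```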


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.213 07-Mar-2019 claudio

Do a better job at cleaning up the config on shutdown. Remove bits that
were missed before (e.g. network related objects). This helps to detect
memory leaks.
Start using new_config() and free_config() in all places where bgpd_config
structure are used. This way the struct is properly initialised and cleaned
up. Introduce copy_config() to only copy the values into the other struct
leaving the pointers as they were.
Looks good to benno@


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
	roa-set {
		165.254.255.0/24 source-as 15562
		193.0.0.0/21 maxlen 24 source-as 3333
	}
	deny from any ovs invalid
	match from any ovs valid set community local-as:42
	match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
	match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
	roa-set "test2" {
		1.2.3.0/24 source-as 1,
		1.2.8.0/22 maxlen 24 source-as 3
	}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
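
The lookup idea can be sketched in a few lines: sort the AS numbers once when the set is loaded, then use bsearch(3) for every membership test. This is only an illustration of the technique named in the commit message. The function names as_cmp, as_set_prepare and as_set_match are hypothetical and the real data structure in bgpd may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison that cannot overflow, unlike plain subtraction. */
int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the set once, for example after parsing the configuration. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	assert(set != NULL || n == 0);
	if (n > 1)
		qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	if (n == 0)
		return 0;
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each test costs O(log n) comparisons instead of a walk over a long list of AS numbers in a filter rule.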


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
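
A minimal sketch of such a pipe teardown, assuming a single child and a socketpair(2) as the IPC channel. The names child_loop and spawn_and_teardown are hypothetical; the real daemon has several children and an imsg layer on top. The parent never calls kill(2): closing its end of the socket is the shutdown request.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Child event loop: runs until the parent's end of the socket is closed. */
static void
child_loop(int fd)
{
	char buf[256];
	ssize_t n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1 && errno == EINTR)
			continue;
		if (n <= 0)
			break;	/* 0 means EOF: the parent closed its end */
		/* message handling would go here */
	}
	_exit(n == 0 ? 0 : 1);
}

/*
 * Start one child, then shut it down by closing the IPC socket.
 * Returns the child's exit status, or -1 on error.
 */
int
spawn_and_teardown(void)
{
	int sv[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	pid = fork();
	if (pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (pid == 0) {
		close(sv[0]);		/* child keeps only its own end */
		child_loop(sv[1]);	/* does not return */
	}
	assert(pid > 0);
	close(sv[1]);	/* parent must not hold the child's end open */
	close(sv[0]);	/* this close is the whole shutdown request */
	while (waitpid(pid, &status, 0) == -1)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because no PID is passed to kill(2), there is no window in which a recycled PID could be signalled by mistake.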


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
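
What "check for EAGAIN and poll again" means for a caller can be sketched with a plain write(2) loop on a non-blocking descriptor. This is not the msgbuf API: struct wbuf, set_nonblock and wbuf_flush are hypothetical names, and the return convention (1 done, 0 try again later, -1 error) is chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

struct wbuf {
	const unsigned char	*data;
	size_t			 len;	/* total bytes queued */
	size_t			 off;	/* bytes already written */
};

/* EAGAIN is only reported for descriptors in non-blocking mode. */
int
set_nonblock(int fd)
{
	int flags;

	if ((flags = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Write as much of the queue as the descriptor accepts.
 * Returns 1 when everything was written, 0 when the write would block
 * (the caller should poll for POLLOUT and call again), -1 on error.
 */
int
wbuf_flush(int fd, struct wbuf *wb)
{
	ssize_t n;

	assert(wb->off <= wb->len);
	while (wb->off < wb->len) {
		n = write(fd, wb->data + wb->off, wb->len - wb->off);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				return 0;	/* not an error: retry later */
			return -1;
		}
		if (n == 0)
			return -1;	/* treated as an error in this sketch */
		wb->off += (size_t)n;
	}
	return 1;
}
```

The important part is that a return of 0 keeps the unwritten tail in the queue; the caller re-arms its poll entry instead of treating the condition as fatal.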


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
	rdomain 1 {
		descr "CUSTOMER1"
		rd 65003:1
		import-target rt 65003:3
		export-target rt 65003:1
		depend on mpe0
		network 192.168.224/24
	}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate a IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
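
Illustration (not part of the commit): a minimal sketch of the pattern
described above, assuming a fixed-size pollfd array. The names
wait_for_events and PFD_COUNT are made up for this example and are not
taken from bgpd.c.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* hypothetical: number of descriptors watched */

/*
 * Zero the whole array before filling it in, so that a poll() call
 * which fails (e.g. with EINTR) cannot leave revents from an earlier
 * iteration behind for the caller to act on.
 */
static int
wait_for_events(struct pollfd *pfd, const int *fds, int timeout_ms)
{
	int i;

	memset(pfd, 0, sizeof(*pfd) * PFD_COUNT);
	for (i = 0; i < PFD_COUNT; i++) {
		pfd[i].fd = fds[i];	/* a negative fd is ignored by poll() */
		pfd[i].events = POLLIN;
	}
	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

The caller should still check the return value first and only look at
revents when poll() reported at least one ready descriptor.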


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
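
Illustration (not part of the commit): a small sketch of the ordering
this change describes. All names (reconfig_pending, on_sighup, do_reload,
handle_pending_reload) are hypothetical; bgpd's own flag and function
names may differ.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;
static int reloads_done;

/* Signal handler: only record that work is pending. */
static void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work. A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next
 * pass, instead of being wiped out by a late "flag = 0".
 * Returns 1 if a reload was performed, 0 otherwise.
 */
static int
handle_pending_reload(void)
{
	if (!reconfig_pending)
		return 0;
	reconfig_pending = 0;
	do_reload();
	return 1;
}
```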


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
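
Illustration (not part of the commit): descriptor passing between
processes is normally done with SCM_RIGHTS control messages on a
UNIX-domain socket. The sketch below shows that mechanism in isolation;
send_fd and recv_fd are hypothetical helpers and not the imsg/msgbuf
code this commit added.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Send one descriptor over a UNIX-domain socket; 0 on success. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;		/* at least one data byte is sent */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd or -1 on failure. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the setup described above, the privileged side would call socket()
and bind() and then hand the result over with something like send_fd(),
so the unprivileged side never needs to bind to a low port itself.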


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
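
Illustration (not part of the commit): a minimal sketch of the
"waitpid() and check status" step, assuming it is called from the main
loop after a SIGCHLD was noted. The helper names child_is_gone and
reap_dead_child are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Nonzero only if the status says the child exited or was killed. */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Returns the pid of a child that is really gone, 0 if no child has
 * terminated yet, or -1 on waitpid() error (e.g. no children left).
 */
static pid_t
reap_dead_child(void)
{
	int status;
	pid_t pid;

	pid = waitpid(-1, &status, WNOHANG);	/* result is a pid_t */
	if (pid <= 0)
		return pid;
	return child_is_gone(status) ? pid : 0;
}
```

The caller can then compare the returned pid against the pids it
recorded at fork time to decide which child went away before logging
and quitting.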


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
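
Illustration (not part of the commit): the read-once, then drain-in-a-loop
pattern, shown with a made-up framing format rather than the real imsg
API. struct rbuf, msg_get and drain are hypothetical; each message here
is a one-byte total length followed by the payload.

```c
#include <stddef.h>
#include <string.h>

struct rbuf {
	unsigned char	data[256];
	size_t		len;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer without reading from
 * any descriptor. Returns the message length, 0 if no complete
 * message is buffered yet, or -1 on a malformed header.
 */
static int
msg_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	size_t mlen;

	if (rb->len < 1)
		return 0;
	mlen = rb->data[0];
	if (mlen < 1 || mlen - 1 > outsz)
		return -1;
	if (rb->len < mlen)
		return 0;		/* incomplete, wait for more data */
	memcpy(out, rb->data + 1, mlen - 1);
	memmove(rb->data, rb->data + mlen, rb->len - mlen);
	rb->len -= mlen;
	return (int)mlen;
}

/*
 * After one read into rb, process every message it contains instead
 * of just the first. Returns the number processed, or -1 on error.
 */
static int
drain(struct rbuf *rb)
{
	unsigned char payload[255];
	int n, count = 0;

	while ((n = msg_get(rb, payload, sizeof(payload))) > 0)
		count++;		/* a real caller would dispatch here */
	return n < 0 ? -1 : count;
}
```

A partial message left at the end stays in the buffer and is completed
by the next read on that descriptor.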


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple
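
Illustration (not part of the commit): the decision above reduces to a
comparison of the old and new setting. The enum and function below are
hypothetical and only name which of kroute_fib_couple or
kroute_fib_decouple would be called.

```c
enum fib_action {
	FIB_NONE,	/* setting unchanged, nothing to do */
	FIB_COUPLE,	/* "no fib-update" was removed */
	FIB_DECOUPLE	/* "no fib-update" was added */
};

/*
 * old_no_fib / new_no_fib: nonzero if "no fib-update" was present in
 * the previous / the freshly parsed configuration.
 */
static enum fib_action
fib_action_on_reload(int old_no_fib, int new_no_fib)
{
	if (old_no_fib && !new_no_fib)
		return FIB_COUPLE;
	if (!old_no_fib && new_no_fib)
		return FIB_DECOUPLE;
	return FIB_NONE;
}
```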


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.212 14-Feb-2019 claudio

mrt_timeout() can send out imsgs so better call it before doing the set_pollfd
this ensures that the imsgs go actually out right away.


# 1.211 14-Feb-2019 claudio

mrt_timeout should just return -1 when there is no timeout set instead
of some strange maximum. The poll loop in bgpd.c already limits the
maximum wait time so there is no need to double it.
While there switch to using time_t for the calculation.
OK phessler@


# 1.210 14-Feb-2019 claudio

Revert part of last commit, this stuff is unrelated.


# 1.209 14-Feb-2019 claudio

Use -1 instead of the less portable INFTIM for the poll timeout.
Result is the same.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
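
Illustration (not part of the commit): a membership test with bsearch(3)
over a sorted array of AS numbers, as a rough picture of the lookup
described above. as_cmp and as_set_contains are hypothetical names, and
the table layout in bgpd may differ.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers for bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);	/* avoids overflow of x - y */
}

/*
 * Returns 1 if as is in the set, 0 otherwise.
 * The array must be sorted in ascending order.
 */
static int
as_set_contains(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

The lookup cost grows with the logarithm of the set size, which is the
point of replacing a long list of AS numbers in a filter rule.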


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where its needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature requested by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
    rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
    }
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio
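
As a rough illustration of why ignoring SIGPIPE is safe: with the signal ignored, a write to a pipe whose reader is gone fails with EPIPE instead of killing the process, so the broken pipe is still detected. The function name below is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical demo: ignore SIGPIPE, write to a pipe with no reader and
 * return the errno of the failed write (0 if the write unexpectedly
 * succeeded, -1 if the pipe could not be created).
 */
static int
write_to_broken_pipe(void)
{
	int fds[2], saved = 0;
	char c = 'x';

	signal(SIGPIPE, SIG_IGN);	/* no signal, just an error return */
	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);			/* the reader goes away */
	if (write(fds[1], &c, 1) == -1)
		saved = errno;		/* EPIPE reports the broken pipe */
	close(fds[1]);
	return saved;
}
```

The returned value is expected to be EPIPE, which the caller can treat like any other closed connection.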


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively when debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
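
A minimal sketch of the pattern, combined with the rule from revision 1.128 above that the pfd array must not be inspected after an error or timeout. The names and the two-entry array are hypothetical; memset(3) stands in for bzero(3).

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

#define NPFD	2	/* hypothetical number of descriptors */

/* Clear the array on every round so no stale revents survive. */
static int
poll_fds(struct pollfd *pfd, int fd0, int fd1, int timeout_ms)
{
	memset(pfd, 0, sizeof(struct pollfd) * NPFD);
	pfd[0].fd = fd0;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd1;
	pfd[1].events = POLLIN;
	return poll(pfd, NPFD, timeout_ms);
}

/* Only trust revents when poll() reported at least one ready fd. */
static int
readable(const struct pollfd *pfd, int nready, int idx)
{
	if (nready <= 0)	/* error (e.g. EINTR) or timeout */
		return 0;
	return (pfd[idx].revents & POLLIN) != 0;
}

/* Usage: one pipe with a pending byte, one idle pipe. Returns 1 if ok. */
static int
poll_demo(void)
{
	struct pollfd pfd[NPFD];
	int a[2], b[2], n, ok = 0;

	if (pipe(a) == -1)
		return 0;
	if (pipe(b) == -1) {
		close(a[0]);
		close(a[1]);
		return 0;
	}
	if (write(a[1], "x", 1) == 1) {
		n = poll_fds(pfd, a[0], b[0], 0);
		ok = n == 1 && readable(pfd, n, 0) && !readable(pfd, n, 1);
	}
	close(a[0]);
	close(a[1]);
	close(b[0]);
	close(b[1]);
	return ok;
}
```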


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically allocated, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear the flag,
then call dowork(). ok henning@
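
The ordering this commit asks for can be sketched as below. If the flag were cleared after the work, a signal arriving while the work runs would be wiped out and lost; clearing first means such a signal sets the flag again and is handled on the next pass. All names are hypothetical, not the ones used in bgpd.c.

```c
#include <signal.h>

static volatile sig_atomic_t reload_requested;	/* set by the handler */
static int reload_calls;			/* counts work done */

static void
sighdlr(int sig)
{
	(void)sig;
	reload_requested = 1;	/* only set a flag in the handler */
}

static void
do_reload(void)
{
	reload_calls++;		/* stand-in for the real work */
}

/* Called from the main loop. Returns 1 if work was done. */
static int
check_reload(void)
{
	if (reload_requested) {
		reload_requested = 0;	/* clear the flag first ... */
		do_reload();		/* ... then do the work */
		return 1;
	}
	return 0;
}
```

After installing sighdlr for SIGHUP and raising the signal once, the first check_reload() call does the work and a second call finds nothing to do.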


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close an mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
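
A minimal sketch of that check: SIGCHLD only says that some child changed state, so the parent has to call waitpid() and look at the status before concluding that a child is gone. The helper names are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the wait status says the child exited or was killed. */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Non-blocking check after a SIGCHLD.
 * Returns 1 if pid is gone, 0 if it is still running, -1 on error.
 */
static int
check_child(pid_t pid)
{
	int status;
	pid_t r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* no exit to report: child still there */
	return child_is_gone(status);
}

/* Usage helper: fork a child that exits immediately with code. */
static pid_t
spawn_exiting_child(int code)
{
	pid_t pid = fork();

	if (pid == 0)
		_exit(code);
	return pid;
}
```

In a main loop, check_child() would be called for each known child pid once the SIGCHLD flag is seen, and only a return value of 1 would lead to logging and shutdown.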


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
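
The split between reading and parsing can be sketched as follows. This is not the imsg API: the framing (one length byte followed by the payload) and all names are hypothetical, chosen only to show one read per poll event followed by a loop that extracts every complete message already buffered.

```c
#include <sys/types.h>
#include <string.h>
#include <unistd.h>

struct rbuf {
	unsigned char	data[256];
	size_t		len;		/* bytes currently buffered */
};

/* One read(2) per poll event. Returns bytes read, 0 on EOF, -1 on error. */
static ssize_t
rbuf_read(struct rbuf *rb, int fd)
{
	ssize_t n;

	n = read(fd, rb->data + rb->len, sizeof(rb->data) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return n;
}

/*
 * Extract one message without touching the fd. out must hold 255 bytes.
 * Returns the payload length, or -1 if no complete message is buffered.
 */
static int
rbuf_get(struct rbuf *rb, unsigned char *out)
{
	size_t mlen;

	if (rb->len < 1)
		return -1;
	mlen = rb->data[0];
	if (rb->len < 1 + mlen)
		return -1;		/* partial message, wait for more */
	memcpy(out, rb->data + 1, mlen);
	rb->len -= 1 + mlen;
	memmove(rb->data, rb->data + 1 + mlen, rb->len);
	return (int)mlen;
}

/* Process everything that is buffered; returns the message count. */
static int
rbuf_drain(struct rbuf *rb)
{
	unsigned char msg[255];
	int count = 0;

	while (rbuf_get(rb, msg) != -1)
		count++;		/* a real caller would dispatch msg */
	return count;
}
```

If two messages arrive in a single read, rbuf_drain() handles both in the same pass instead of leaving the second one queued until the next poll event.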


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.208 11-Feb-2019 claudio

The definition of VPNs in bgpd was never super elegant. The 'depend on
mpeX' config was a bit redundant. Also to make it more flexible (e.g. having
more than one mpeX interface per rdomain) the syntax was changed.

To make this possible especially the network distribution logic had to be
adjusted and cleaned up. This should in general make network statements
well defined and conflicts between 'network A.B.C.D/N' and e.g. 'network static'
are handled in a well defined way ('network A.B.C.D/N' has preference).

With and OK dlg@, OK denis@


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
    roa-set {
        165.254.255.0/24 source-as 15562
        193.0.0.0/21 maxlen 24 source-as 3333
    }
    deny from any ovs invalid
    match from any ovs valid set community local-as:42
    match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
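
A minimal sketch, not bgpd's actual code, of how the three ovs states used in the rules above could be derived for an IPv4 route, following the matching rules of RFC 6811. The struct layout and function names are hypothetical, addresses are assumed to be in host byte order, and an entry without maxlen is assumed to use its own prefix length.

```c
#include <stddef.h>
#include <stdint.h>

enum ovs_state { OVS_NOT_FOUND, OVS_VALID, OVS_INVALID };

/* Hypothetical flat roa-set entry; not the structure bgpd uses. */
struct roa_entry {
	uint32_t prefix;	/* host byte order */
	uint8_t  plen;		/* 0..32 */
	uint8_t  maxlen;	/* plen..32 */
	uint32_t source_as;
};

/* Does the ROA prefix cover the route addr/plen? */
static int
roa_covers(const struct roa_entry *r, uint32_t addr, uint8_t plen)
{
	uint32_t mask;

	if (plen < r->plen)
		return 0;
	mask = r->plen == 0 ? 0 : 0xffffffffU << (32 - r->plen);
	return (addr & mask) == (r->prefix & mask);
}

enum ovs_state
roa_validate(const struct roa_entry *set, size_t n, uint32_t addr,
    uint8_t plen, uint32_t origin_as)
{
	enum ovs_state state = OVS_NOT_FOUND;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!roa_covers(&set[i], addr, plen))
			continue;
		/* one fully matching entry is enough for "valid" */
		if (set[i].source_as == origin_as && plen <= set[i].maxlen)
			return OVS_VALID;
		/* covered, but wrong AS or too specific */
		state = OVS_INVALID;
	}
	return state;
}
```

With the two entries from the example config, 193.0.4.0/24 from AS 3333 would be valid, the same prefix as a /25 would be invalid, and 10.0.0.0/8 would be not-found.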


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@
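
The barrier idea can be sketched as a small counter, assuming each child acknowledges the drain message once it has consumed all preceding config messages. The names below are hypothetical and do not come from bgpd.c.

```c
/* Hypothetical bookkeeping kept by the parent during a reload. */
struct reconf_barrier {
	int pending;	/* children that have not acknowledged yet */
};

/* Called after the drain message was queued to every child. */
void
reconf_barrier_start(struct reconf_barrier *b, int nchildren)
{
	b->pending = nchildren;
}

/*
 * Called for each acknowledgement. Returns 1 once all children have
 * caught up, i.e. when the "done" message may be sent to all of them.
 */
int
reconf_barrier_ack(struct reconf_barrier *b)
{
	if (b->pending > 0)
		b->pending--;
	return b->pending == 0;
}
```

The point of the design is that no child activates the new config until the slowest one has received all of it.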


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
    1.2.3.0/24 source-as 1,
    1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK; there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@
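
A rough sketch of the per-node check described above, limited to IPv4 and to a single node rather than a full trie. The struct and function names are hypothetical. The allowed prefix lengths are kept as a bitmask in which bit n stands for prefix length n.

```c
#include <stdint.h>

/* Hypothetical trie node: addr/plen plus a bitmask of accepted lengths. */
struct ps_node {
	uint32_t addr;		/* host byte order */
	uint8_t  plen;		/* 0..32 */
	uint64_t lenmask;	/* bit n set: prefix length n is accepted */
};

/* Build the bitmask for "prefixlen min - max". */
uint64_t
len_range_mask(uint8_t min, uint8_t max)
{
	uint64_t m = 0;
	unsigned int n;

	for (n = min; n <= max && n <= 32; n++)
		m |= UINT64_C(1) << n;
	return m;
}

/* Returns 1 if addr/plen lies under the node and its length is accepted. */
int
ps_node_match(const struct ps_node *node, uint32_t addr, uint8_t plen)
{
	uint32_t mask;

	if (plen > 32 || plen < node->plen)
		return 0;
	mask = node->plen == 0 ? 0 : 0xffffffffU << (32 - node->plen);
	if ((addr & mask) != (node->addr & mask))
		return 0;
	return (int)((node->lenmask >> plen) & 1);
}
```

Because any match is sufficient, a lookup can stop at the first node on the path whose bitmask has the bit for the route's prefix length set.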


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
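
A minimal sketch of a bsearch-based membership test over a sorted array of AS numbers. The function names are hypothetical; only qsort(3) and bsearch(3) are real library calls.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two 32-bit AS numbers. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort once after the set is loaded; bsearch needs sorted input. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
int
as_set_contains(const uint32_t *set, size_t n, uint32_t as)
{
	return bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL;
}
```

Each lookup takes a logarithmic number of comparisons, where a long list of AS numbers in a filter rule would be scanned linearly.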


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
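
The mechanism behind the pipe teardown can be shown in a few lines: once one end of a socketpair is closed, a read on the other end returns 0 (EOF). This is a single-process sketch with a hypothetical function name. In the daemon the two ends would live in different processes.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the remaining end sees EOF after the other end was
 * closed, 0 if not, -1 if the socketpair could not be created.
 */
int
peer_sees_eof_after_close(void)
{
	int sv[2];
	char c;
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	close(sv[0]);			/* "parent" side shuts down */
	n = read(sv[1], &c, 1);		/* "child" side: 0 means EOF */
	close(sv[1]);
	return n == 0;
}
```

A child that treats EOF on its IPC socket as the signal to leave its event loop needs no kill(2) from the parent.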


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
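
A small sketch of the replacement pattern: the flags are passed at socket creation time, so no separate fcntl(2) round trip is needed to change the mode afterwards. The helper names are hypothetical, and fcntl(2) is used here only to verify the result. accept4() takes the same two flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a socket that is non-blocking and close-on-exec from the start. */
int
open_nonblock_cloexec(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Returns 1 if both O_NONBLOCK and FD_CLOEXEC are set on fd. */
int
fd_is_nonblock_cloexec(int fd)
{
	int fl, fdfl;

	if ((fl = fcntl(fd, F_GETFL)) == -1 ||
	    (fdfl = fcntl(fd, F_GETFD)) == -1)
		return -1;
	return (fl & O_NONBLOCK) != 0 && (fdfl & FD_CLOEXEC) != 0;
}
```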


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
    descr "CUSTOMER1"
    rd 65003:1
    import-target rt 65003:3
    export-target rt 65003:1
    depend on mpe0
    network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
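
A sketch of the defensive pattern described here, with a hypothetical helper: the pollfd array is zeroed before every call, and revents is only inspected when poll() reports at least one ready descriptor.

```c
#include <errno.h>
#include <poll.h>
#include <string.h>

/*
 * Returns 1 if fd is readable, 0 on timeout or EINTR, -1 on any
 * other poll() error.
 */
int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int n;

	/* start clean so stale revents from an earlier call cannot leak in */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n == -1)
		return errno == EINTR ? 0 : -1;
	if (n == 0)
		return 0;	/* timeout: do not look at revents */
	return (pfd[0].revents & POLLIN) != 0;
}
```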


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call dowork(). ok henning@
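
The corrected ordering, sketched with the names from the message above. Everything else (the SIGHUP handler, the counter, install_handler()) is hypothetical scaffolding. If the flag were cleared after dowork(), a signal arriving while dowork() runs would be wiped out and lost.

```c
#include <signal.h>

static volatile sig_atomic_t signalflag;
static int work_count;	/* only here so the sketch has a visible effect */

static void
sighdlr(int sig)
{
	(void)sig;
	signalflag = 1;
}

static void
dowork(void)
{
	work_count++;
}

int
install_handler(void)
{
	return signal(SIGHUP, sighdlr) == SIG_ERR ? -1 : 0;
}

/* Called from the main loop. */
void
check_signal(void)
{
	if (signalflag) {
		/* clear first: a signal during dowork() sets it again */
		signalflag = 0;
		dowork();
	}
}
```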


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
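
For illustration, a generic sketch of descriptor passing over a UNIX-domain socket with SCM_RIGHTS. This is not the imsg/msgbuf code; send_fd(), recv_fd() and demo_fd_pass() are hypothetical helpers. The demo passes the write end of a pipe and writes through the received copy.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Control buffer, aligned for struct cmsghdr, sized for one int. */
union fd_cmsgbuf {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over sock together with one dummy data byte. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cb;
	char c = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cb, 0, sizeof(cb));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cb.buf;
	msg.msg_controllen = sizeof(cb.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns it, or -1 on failure. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsgbuf cb;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cb.buf;
	msg.msg_controllen = sizeof(cb.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Returns 1 if data written via the passed descriptor arrives. */
int
demo_fd_pass(void)
{
	int sv[2], p[2], fd;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1 || pipe(p) == -1)
		return -1;
	if (send_fd(sv[0], p[1]) == -1 || (fd = recv_fd(sv[1])) == -1)
		return -1;
	if (write(fd, "x", 1) != 1 || read(p[0], &c, 1) != 1)
		return -1;
	close(fd);
	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);
	return c == 'x';
}
```

In the scenario described above, the privileged process would call socket() and bind() and pass the resulting descriptor, so the unprivileged receiver never needs the right to bind to a low port.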


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
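
A sketch of the check described above, with hypothetical helper names: a status change reported by waitpid(2) only counts as "child is gone" when WIFEXITED or WIFSIGNALED is true. A stopped or continued child also triggers SIGCHLD but must not cause a shutdown.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child exited or was killed by a signal, 0 if
 * nothing relevant happened, -1 on waitpid() error.
 */
int
check_child(pid_t pid, int options)
{
	int status;
	pid_t r;

	r = waitpid(pid, &status, options);
	if (r == -1)
		return -1;
	if (r == 0)
		return 0;	/* WNOHANG and no state change */
	return (WIFEXITED(status) || WIFSIGNALED(status)) ? 1 : 0;
}

/* Demo helper: fork a child that exits at once, then classify it. */
int
demo_exited_child(void)
{
	pid_t pid;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(0);
	return check_child(pid, 0);
}
```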


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
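
The read/get split can be illustrated with a toy framing layer. This is not the imsg API: the one-byte length prefix, struct rbuf, and the rbuf_* functions are assumptions made for the sketch. One fill step may deliver several messages, so the caller keeps extracting until the buffer holds no complete message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical receive buffer. */
struct rbuf {
	uint8_t data[256];
	size_t  len;		/* bytes currently buffered */
};

/* Append raw bytes, as a read(2) on the socket would. */
int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return -1;
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one message (1-byte length prefix plus payload) without
 * doing any I/O. Returns 1 if a message was copied out, 0 if the
 * buffer holds no complete message, -1 if out is too small.
 */
int
rbuf_get(struct rbuf *rb, uint8_t *out, size_t outsz, size_t *outlen)
{
	size_t plen;

	if (rb->len < 1)
		return 0;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return 0;	/* partial message, wait for more data */
	if (plen > outsz)
		return -1;
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return 1;
}

/* Process every complete message; returns how many were handled. */
int
rbuf_drain(struct rbuf *rb)
{
	uint8_t msg[255];
	size_t n;
	int r, count = 0;

	while ((r = rbuf_get(rb, msg, sizeof(msg), &n)) == 1)
		count++;
	return r == -1 ? -1 : count;
}
```

Calling the extraction step only once per poll event would leave the second and later messages sitting in the buffer, which is the queue build-up the message describes.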


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.207 20-Jan-2019 bcook

explicitly check if the session engine exited by comparing the pid

ok claudio@


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
    165.254.255.0/24 source-as 15562
    193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.
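
The distinction between element count and byte size can be illustrated with a
small helper. table_bytes() is a hypothetical name for this sketch, and the
overflow check is an addition not mentioned in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of bytes needed to ship nelem table entries of elemsize bytes
 * each. Passing nelem itself as the size would only be right for
 * one-byte elements.
 * Returns 0 if the multiplication would overflow.
 */
size_t
table_bytes(size_t nelem, size_t elemsize)
{
	if (elemsize != 0 && nelem > SIZE_MAX / elemsize)
		return 0;
	return nelem * elemsize;
}
```

For a table of ten 32-bit AS numbers, for instance, the message payload is
table_bytes(10, sizeof(uint32_t)) bytes, not 10.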


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
    1.2.3.0/24 source-as 1,
    1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@
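
The per-node prefixlen bitmask mentioned above might look roughly like this.
It covers only the mask on a single node, not the trie itself; plen_mask() and
plen_match() are hypothetical names, and the 64-bit mask limits this sketch to
prefix lengths 0..63 (enough for IPv4, not for IPv6).

```c
#include <stdint.h>

/*
 * Build a mask in which bit i is set when prefix length i lies in the
 * range min..max. Lengths of 64 and above are ignored in this sketch.
 */
uint64_t
plen_mask(unsigned int min, unsigned int max)
{
	uint64_t m = 0;
	unsigned int i;

	for (i = min; i <= max && i < 64; i++)
		m |= UINT64_C(1) << i;
	return m;
}

/* Return 1 if prefix length plen is allowed by mask, 0 otherwise. */
int
plen_match(uint64_t mask, unsigned int plen)
{
	return plen < 64 && ((mask >> plen) & 1);
}
```

Several "prefixlen min - max" ranges for the same addr/plen node can then be
combined by OR-ing their masks, and a lookup only needs one bit test per node
it visits.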


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
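
A sorted-array lookup with bsearch(3), as described above, can be sketched like
this. The function names are hypothetical and the table layout (a plain array
of 32-bit AS numbers) is an assumption, not the actual as_set structure.

```c
#include <stdint.h>
#include <stdlib.h>

/* Order two AS numbers for qsort(3) and bsearch(3). */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Sort the table once after it has been filled. */
void
as_table_prepare(uint32_t *tbl, size_t n)
{
	qsort(tbl, n, sizeof(tbl[0]), as_cmp);
}

/* Return 1 if as is in the sorted table, 0 otherwise. */
int
as_table_contains(const uint32_t *tbl, size_t n, uint32_t as)
{
	return bsearch(&as, tbl, n, sizeof(tbl[0]), as_cmp) != NULL;
}
```

The comparison avoids subtracting the two values, since the difference of two
32-bit unsigned numbers does not fit the int return value in general.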


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
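
The effect the teardown relies on, that closing one end of an IPC socket makes
the other end read EOF, can be demonstrated in isolation. This is a
single-process sketch with a hypothetical function name; in bgpd the two ends
live in different processes.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a socketpair, close the "parent" end and report what the
 * "child" end sees. Returns read(2)'s result on the child end:
 * 0 means EOF, i.e. the peer has gone away.
 */
ssize_t
eof_after_close(void)
{
	int	sv[2];
	char	c;
	ssize_t	n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	close(sv[0]);			/* parent tears down its end */
	n = read(sv[1], &c, 1);		/* child end: no data, peer closed */
	close(sv[1]);
	return n;
}
```

An event loop that treats a zero-length read as "peer gone" can then leave the
loop and exit on its own, without the parent having to signal it.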


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
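
As a rough illustration of the replacement, the socket type flags let a
descriptor be created non-blocking and close-on-exec in one call instead of
adjusting it afterwards with fcntl(2). The helper name is hypothetical, and the
flags are assumed to be available (they are on OpenBSD and Linux, but not on
every system).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open a stream socket in the given domain that is non-blocking and
 * close-on-exec from the moment it exists. Returns the fd or -1.
 */
int
open_nonblock_cloexec(int domain)
{
	return socket(domain, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}
```

accept4() takes the same two flags for accepted connections, which removes the
window between accept() and a follow-up fcntl() call.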


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
    descr "CUSTOMER1"
    rd 65003:1
    import-target rt 65003:3
    export-target rt 65003:1
    depend on mpe0
    network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively in debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
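
The fix can be sketched as a setup helper that zeroes the array on every loop
iteration. pfd_setup() and PFD_COUNT are hypothetical names, and memset(3) is
used here where the commit mentions bzero.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2

/*
 * Prepare the pollfd array for one loop iteration. Zeroing first
 * guarantees that revents is 0 for every slot, so if poll() fails
 * (for example with EINTR) no stale event from the previous
 * iteration is acted upon.
 */
void
pfd_setup(struct pollfd pfd[PFD_COUNT], int fd_a, int fd_b)
{
	memset(pfd, 0, sizeof(struct pollfd) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;
}
```

The caller still has to check poll()'s return value and skip the fd handling on
error or timeout; the zeroing only ensures that a skipped check cannot turn
into a blocking read on a descriptor that is not actually ready.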


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
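
For readers unfamiliar with descriptor passing, a minimal sketch of the underlying mechanism follows. It uses SCM_RIGHTS over a Unix-domain socket and is not the imsg/msgbuf code itself: send_fd(), recv_fd() and union fd_cmsg are hypothetical names chosen for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Control-message buffer, aligned for struct cmsghdr, sized for one fd. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd across the Unix-domain socket sock. Returns 0 or -1. */
static int
send_fd(int sock, int fd)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	char		 dummy = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cm, 0, sizeof(cm));
	iov.iov_base = &dummy;	/* one data byte travels with the fd */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Receive one descriptor from sock. Returns the new fd or -1. */
static int
recv_fd(int sock)
{
	struct msghdr	 msg;
	struct cmsghdr	*cmsg;
	struct iovec	 iov;
	union fd_cmsg	 cm;
	char		 dummy;
	int		 fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cm.buf;
	msg.msg_controllen = sizeof(cm.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return (fd);
}
```

In the scenario of this entry the privileged process would call socket() and bind() first and hand the result to send_fd(); the unprivileged receiver then only needs listen() and accept().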


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
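
A minimal sketch of the check described above, assuming a hypothetical helper check_child() that a main loop would call after seeing SIGCHLD. It is not the code from bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Ask the kernel what actually happened to the child instead of assuming
 * that SIGCHLD means it died.
 * Returns 1 if the child exited or was killed by a signal, 0 if there is
 * nothing to reap (for example the child was only stopped), -1 on error.
 */
static int
check_child(pid_t pid, const char *pname)
{
	int	status;
	pid_t	ret;

	do {
		ret = waitpid(pid, &status, WNOHANG);
	} while (ret == -1 && errno == EINTR);

	if (ret == -1)
		return (-1);
	if (ret == 0)
		return (0);
	if (WIFEXITED(status)) {
		fprintf(stderr, "%s exited with status %d\n", pname,
		    WEXITSTATUS(status));
		return (1);
	}
	if (WIFSIGNALED(status)) {
		fprintf(stderr, "%s terminated by signal %d\n", pname,
		    WTERMSIG(status));
		return (1);
	}
	return (0);
}
```

Only a return value of 1 would justify logging and shutting down; 0 means the daemon can keep running.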


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
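
The read/get split can be illustrated with a small length-prefixed buffer. This is a sketch under assumptions, not the imsg implementation: struct msg_hdr, struct rbuf, rbuf_fill() and rbuf_get() are hypothetical stand-ins for the imsg header, the read buffer, imsg_read and imsg_get.

```c
#include <sys/types.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define RBUF_SIZE	4096

/* Hypothetical wire header: total length (header included) and a type. */
struct msg_hdr {
	uint16_t	len;
	uint16_t	type;
};

struct rbuf {
	unsigned char	buf[RBUF_SIZE];
	size_t		used;		/* bytes currently buffered */
};

/* One read(2) per poll event; it may pull in several messages at once. */
static ssize_t
rbuf_fill(struct rbuf *rb, int fd)
{
	ssize_t	n;

	n = read(fd, rb->buf + rb->used, sizeof(rb->buf) - rb->used);
	if (n > 0)
		rb->used += (size_t)n;
	return (n);	/* 0: peer closed, -1: error; the caller decides */
}

/*
 * Extract at most one message from the buffer without touching the fd.
 * Returns 1 if a message was copied out, 0 if more data is needed,
 * -1 on a malformed header or a too-small payload buffer.
 */
static int
rbuf_get(struct rbuf *rb, struct msg_hdr *hdr, unsigned char *payload,
    size_t paylen)
{
	size_t	datalen;

	if (rb->used < sizeof(*hdr))
		return (0);
	memcpy(hdr, rb->buf, sizeof(*hdr));
	if (hdr->len < sizeof(*hdr) || hdr->len > RBUF_SIZE)
		return (-1);
	if (rb->used < hdr->len)
		return (0);
	datalen = hdr->len - sizeof(*hdr);
	if (datalen > paylen)
		return (-1);
	memcpy(payload, rb->buf + sizeof(*hdr), datalen);
	/* Move any following messages to the front of the buffer. */
	memmove(rb->buf, rb->buf + hdr->len, rb->used - hdr->len);
	rb->used -= hdr->len;
	return (1);
}
```

A caller would invoke rbuf_fill() once per poll event and then call rbuf_get() in a loop until it returns 0, so that no buffered message waits for the next event.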


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.206 18-Jan-2019 claudio

Don't store the mpe information in struct ktable but instead pass the
ifindex from the RDE over. This will allow to import prefixes to multiple
mpe interfaces in one rdomain.
OK dlg@


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and a origin-set which allows to
lookup a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
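
The lookup idea can be sketched with the standard qsort(3) and bsearch(3) calls. The helper names as_cmp(), as_set_prepare() and as_set_match() are hypothetical, and the real as-set code may store its data differently.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two AS numbers without integer overflow. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort once when the set is built; bsearch(3) needs ordered input. */
static void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is a member of the sorted set, 0 otherwise. */
static int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each match costs O(log n) comparisons, compared with the O(n) walk of a long list of AS numbers in a filter rule.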


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
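
A minimal sketch of such a teardown, assuming a single child connected by a socketpair. child_loop() and spawn_and_teardown() are hypothetical names, and the real daemon runs a full event loop rather than a bare read(2).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/* Child side: event loop reduced to "read until the parent goes away". */
static void
child_loop(int fd)
{
	char	buf[64];
	ssize_t	n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == -1 && errno == EINTR)
			continue;
		if (n <= 0)
			break;	/* 0 is EOF: the parent closed its end */
	}
	_exit(n == 0 ? 0 : 1);
}

/*
 * Start a child, then shut it down by closing the parent's end of the
 * socketpair instead of calling kill(2). Returns the child's exit
 * status, or -1 on failure.
 */
static int
spawn_and_teardown(void)
{
	int	sv[2], status;
	pid_t	pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	pid = fork();
	if (pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		close(sv[0]);
		child_loop(sv[1]);	/* never returns */
	}
	close(sv[1]);
	close(sv[0]);			/* this is the shutdown request */
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return (-1);
	}
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Because the parent never signals a PID, a recycled PID cannot be hit by mistake, which is the race the entry refers to.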


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into its own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
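
The difference between the two styles can be sketched as below, on systems that provide the SOCK_NONBLOCK and SOCK_CLOEXEC type flags (OpenBSD and Linux do). socket_fcntl_style() and socket_flag_style() are hypothetical names; the first only approximates what a helper like session_socket_blockmode() had to do.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Old style: create the socket, then adjust its flags with fcntl(2). */
static int
socket_fcntl_style(void)
{
	int	fd, fl;

	if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) == -1)
		return (-1);
	if ((fl = fcntl(fd, F_GETFL)) == -1 ||
	    fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1 ||
	    fcntl(fd, F_SETFD, FD_CLOEXEC) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/* New style: request both properties atomically at creation time. */
static int
socket_flag_style(void)
{
	return (socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC,
	    0));
}
```

accept4() takes the same two flags for accepted connections, so no window remains in which a descriptor exists without them.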


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
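
What "check and poll again" means for a caller can be sketched with plain write(2) on a non-blocking descriptor. This does not use the msgbuf API: set_nonblock() and flush_pending() are hypothetical helpers, and the 1/0/-1 return convention is an assumption of this sketch.

```c
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 or -1. */
static int
set_nonblock(int fd)
{
	int	fl;

	if ((fl = fcntl(fd, F_GETFL)) == -1)
		return (-1);
	return (fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1 ? -1 : 0);
}

/*
 * Try to write the unsent tail of buf, advancing *off as data goes out.
 * Returns 1 when everything has been written, 0 when the descriptor
 * would block and the caller should poll for writability again, and
 * -1 on a real error.
 */
static int
flush_pending(int fd, const char *buf, size_t len, size_t *off)
{
	ssize_t	n;

	while (*off < len) {
		n = write(fd, buf + *off, len - *off);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN)
				return (0);	/* not an error: retry later */
			return (-1);
		}
		*off += (size_t)n;
	}
	return (1);
}
```

On a 0 return the caller keeps POLLOUT set for that descriptor and calls flush_pending() again on the next writable event.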


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the children on bootup issue
a config reload as first step in bootup. This allows children to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
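A minimal sketch of the pattern described above, assuming a single
descriptor and a hypothetical helper name (poll_readable); it is not
the bgpd code. The array is zeroed before every call, and revents is
only inspected when poll() reports at least one ready descriptor.

```c
#include <poll.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if fd is readable, 0 on timeout,
 * -1 on poll() error (including EINTR).
 */
int
poll_readable(int fd, int timeout_ms)
{
	struct pollfd	pfd[1];
	int		nfds;

	/* Clear the array so no stale revents survive a failed poll(). */
	memset(pfd, 0, sizeof(pfd));
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds <= 0)
		return (nfds);	/* error or timeout: do not touch revents */

	return ((pfd[0].revents & POLLIN) ? 1 : 0);
}
```

Calling poll_readable(fd, 0) on an empty pipe should report a timeout
rather than a leftover readable event from an earlier iteration.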


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
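The ordering described above can be sketched as follows. All names
(reconfig_pending, do_reload, handle_pending_reload) are hypothetical
and only illustrate the idea: if the flag is cleared after the work, a
signal arriving while the work runs is lost; clearing it first means
such a signal sets the flag again and is handled on the next pass.

```c
#include <signal.h>

/* Set from the signal handler, consumed by the main loop. */
volatile sig_atomic_t	reconfig_pending;
int			reloads_done;

void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading a config file. */
void
do_reload(void)
{
	reloads_done++;
}

/* Returns 1 if a reload was performed, 0 otherwise. */
int
handle_pending_reload(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;	/* clear first ... */
		do_reload();		/* ... then do the work */
		return (1);
	}
	return (0);
}
```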


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups). OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
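A small sketch of the check described above, with hypothetical helper
names (child_ended, check_child). SIGCHLD is also delivered when a
child is merely stopped or continued, so the status from waitpid() has
to be inspected before concluding that the child is gone.

```c
#include <sys/types.h>
#include <sys/wait.h>

/* True only if the child exited or was killed by a signal. */
int
child_ended(int status)
{
	return (WIFEXITED(status) || WIFSIGNALED(status));
}

/*
 * Returns 1 if the child has ended, 0 if it is still around (no state
 * change with WNOHANG, or only stopped with WUNTRACED), -1 on error.
 */
int
check_child(pid_t pid, int options)
{
	int	status;
	pid_t	r;		/* waitpid() returns a pid_t */

	r = waitpid(pid, &status, options);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);
	return (child_ended(status));
}
```

A SIGCHLD handler would typically only set a flag; the main loop then
calls something like check_child(pid, WNOHANG) and logs and quits only
when it returns 1.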


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on an AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
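The split between reading and parsing can be illustrated with a toy
framing layer. This is not the imsg API: the struct rbuf type, the
2-byte big-endian length header, and the functions rbuf_fill,
frame_get and frames_drain are assumptions made for the sketch. The
point is that one fill may deliver several messages, so the getter is
called in a loop until it reports that no complete message is left.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_LEN	2	/* assumed header: total frame length, big endian */

struct rbuf {
	uint8_t	data[4096];
	size_t	len;
};

/* Append bytes to the buffer, as a read(2) on the pipe would. */
int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > sizeof(rb->data) - rb->len)
		return (-1);
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return (0);
}

/* Returns 1 if one frame was extracted, 0 if more data is needed,
 * -1 if the header is malformed. Never reads from a descriptor. */
int
frame_get(struct rbuf *rb, uint8_t *payload, size_t *plen)
{
	size_t	total;

	if (rb->len < HDR_LEN)
		return (0);
	total = ((size_t)rb->data[0] << 8) | rb->data[1];
	if (total < HDR_LEN || total > sizeof(rb->data))
		return (-1);
	if (rb->len < total)
		return (0);
	*plen = total - HDR_LEN;
	memcpy(payload, rb->data + HDR_LEN, *plen);
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (1);
}

/* Process every complete frame currently buffered; returns the count. */
int
frames_drain(struct rbuf *rb)
{
	uint8_t	payload[4096];
	size_t	plen;
	int	n, count = 0;

	while ((n = frame_get(rb, payload, &plen)) == 1)
		count++;	/* a real caller would dispatch here */
	return (n == -1 ? -1 : count);
}
```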


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.205 27-Dec-2018 remi

Check if a control socket or address is already in use before using it.
If it is used abort startup or let a reload fail.
Sockets are now not unlinked anymore on regular shutdown.

This helps a lot when one tries to do a config check without -n.

Inputs and OK claudio@


Revision tags: OPENBSD_6_4_BASE
# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
165.254.255.0/24 source-as 15562
193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@
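The three "ovs" states used in the filter rules above can be sketched
with a simplified validator. This is an illustration under assumptions,
not the bgpd implementation: it handles IPv4 only, scans a flat array
instead of a trie, expects prefix lengths of at most 32, and the names
(struct roa, roa_validate, OVS_*, IP4) are made up for the example. An
entry without "maxlen" is assumed to have maxlen equal to its prefix
length. A route is valid if some covering entry allows its length and
origin AS, invalid if it is covered but no entry matches, and not-found
if no entry covers it.

```c
#include <stddef.h>
#include <stdint.h>

#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	((uint32_t)(c) << 8) | (uint32_t)(d))

enum ovs { OVS_NOT_FOUND, OVS_VALID, OVS_INVALID };

struct roa {
	uint32_t	addr;	/* host byte order */
	uint8_t		plen;
	uint8_t		maxlen;
	uint32_t	asnum;	/* source-as */
};

/* Netmask for a prefix length of 0..32 (a shift by 32 is avoided). */
static uint32_t
plen2mask(uint8_t plen)
{
	return (plen == 0 ? 0 : 0xffffffffU << (32 - plen));
}

enum ovs
roa_validate(const struct roa *set, size_t n, uint32_t addr, uint8_t plen,
    uint32_t origin)
{
	size_t	i;
	int	covered = 0;

	for (i = 0; i < n; i++) {
		uint32_t mask = plen2mask(set[i].plen);

		/* The entry must be equal to or less specific than the route. */
		if (set[i].plen > plen)
			continue;
		if ((addr & mask) != (set[i].addr & mask))
			continue;
		covered = 1;
		if (plen <= set[i].maxlen && origin == set[i].asnum)
			return (OVS_VALID);
	}
	return (covered ? OVS_INVALID : OVS_NOT_FOUND);
}
```

With the two entries from the roa-set above, 193.0.4.0/24 originated
by AS 3333 would come out as valid, the same prefix from another AS or
a /25 inside it as invalid, and an unrelated prefix as not-found.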


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both children got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the children more or less simultaneously.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This is sharing a lot of code with prefixset which makes it all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
1.2.3.0/24 source-as 1,
1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way of fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows to trigger
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
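The bsearch-based membership test mentioned above can be sketched like
this. The function names (as_set_prepare, as_set_match) and the flat
uint32_t array are assumptions for illustration only; the sketch just
shows the technique of sorting once and then using bsearch(3) for each
lookup.

```c
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison without the overflow risk of subtraction. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return ((x > y) - (x < y));
}

/* Sort the table once after loading it; bsearch() needs sorted input. */
void
as_set_prepare(uint32_t *set, size_t n)
{
	qsort(set, n, sizeof(*set), as_cmp);
}

/* Returns 1 if as is in the sorted set, 0 otherwise. */
int
as_set_match(const uint32_t *set, size_t n, uint32_t as)
{
	return (bsearch(&as, set, n, sizeof(*set), as_cmp) != NULL);
}
```

Each lookup then costs O(log n) comparisons instead of walking a long
list of AS numbers per filter rule.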


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
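The "pipe teardown" idea can be sketched with a socketpair and a
single child. The function name (pipe_teardown_demo) and the trivial
child loop are assumptions for illustration. The parent never calls
kill(2): closing its end makes the child's read(2) return 0, the child
leaves its loop and exits, and the parent collects it with waitpid().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the child's exit code (0 on a clean EOF exit), -1 on error. */
int
pipe_teardown_demo(void)
{
	int	sv[2], status;
	pid_t	pid;
	char	buf[64];
	ssize_t	n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);

	pid = fork();
	if (pid == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		/* Child: drop the parent's end or EOF is never seen. */
		close(sv[0]);
		while ((n = read(sv[1], buf, sizeof(buf))) > 0)
			;	/* a real child would handle messages */
		_exit(n == 0 ? 0 : 1);
	}

	/* Parent: closing both of its descriptors is the shutdown signal. */
	close(sv[1]);
	close(sv[0]);

	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```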


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
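A short sketch of the replacement pattern, assuming a platform that
supports the SOCK_NONBLOCK and SOCK_CLOEXEC type flags (OpenBSD and
Linux do). The helper names are hypothetical. The flags are requested
atomically at creation time instead of being set afterwards with
separate fcntl(2) calls; accept4() takes the same flags for accepted
connections and is not shown here.

```c
#include <sys/socket.h>
#include <fcntl.h>

/* Create a socket that is non-blocking and close-on-exec from the start. */
int
open_nb_cloexec_socket(void)
{
	return (socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0));
}

/* Returns 1 if both flags are set on fd, 0 if not, -1 on fcntl() error. */
int
has_nb_cloexec(int fd)
{
	int	fl, fdfl;

	fl = fcntl(fd, F_GETFL);	/* file status flags: O_NONBLOCK */
	fdfl = fcntl(fd, F_GETFD);	/* descriptor flags: FD_CLOEXEC */
	if (fl == -1 || fdfl == -1)
		return (-1);
	return ((fl & O_NONBLOCK) && (fdfl & FD_CLOEXEC));
}
```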


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows using "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
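
The fix described above can be sketched as follows. This is only an
illustration under assumed names: poll_fds_once() and PFD_COUNT are
hypothetical and do not reflect bgpd's actual pfd layout. The point is that
the array is cleared before every poll() call, so a failed call cannot leave
revents from an earlier iteration behind.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2	/* illustrative; not bgpd's actual array size */

/*
 * Zero the array before every poll() call so that a failed call
 * (for example EINTR) cannot leave stale revents from an earlier round.
 */
static int
poll_fds_once(int fd_a, int fd_b, struct pollfd *pfd, int timeout_ms)
{
	memset(pfd, 0, sizeof(struct pollfd) * PFD_COUNT);
	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

The caller should still check the return value and skip the revents
handling entirely when poll() reports an error or a timeout.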


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of allocating them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call dowork(). ok henning@
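
A minimal sketch of the corrected ordering, assuming a SIGHUP-driven reload.
The names reconfig_pending, do_reload() and handle_pending_reload() are
hypothetical stand-ins for the signalflag and dowork() placeholders in the
message above, not identifiers taken from bgpd.

```c
#include <signal.h>

/* Hypothetical flag; set asynchronously by the signal handler. */
static volatile sig_atomic_t reconfig_pending;
static int reloads_done;

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work, e.g. rereading the config file. */
static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_reload() runs sets the flag again and is handled on the next call;
 * clearing the flag after the work would silently discard it.
 */
static void
handle_pending_reload(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```

With this ordering a late signal costs at most one extra reload, whereas
the old ordering could lose a reload request altogether.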


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
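
The check described above might look roughly like this. It is a sketch
rather than bgpd's code: child_is_gone() is a hypothetical helper, and
logging and shutdown are left to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/*
 * Hypothetical helper, called after SIGCHLD was noted.
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if waitpid() has no such state change to report,
 * and -1 on a waitpid() error (for example no such child).
 */
static int
child_is_gone(pid_t child)
{
	int status;
	pid_t pid;

	do {
		pid = waitpid(child, &status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return -1;
	if (pid == 0)
		return 0;	/* child still running */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return 1;
	return 0;
}
```

Only a return value of 1 means the child is really gone, which is the case
where the parent would log the event and quit.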


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
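
The split between reading and parsing can be illustrated with a small
self-contained sketch. It deliberately does not use the real imsg API: the
struct rbuf, the 2-byte length prefix, and the functions msg_get() and
msg_drain() are all hypothetical. It only shows why the caller has to loop
until the buffer holds no further complete message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical framing: a 2-byte length that counts the whole message. */
#define HDR_SIZE 2

struct rbuf {
	uint8_t	data[4096];
	size_t	len;	/* bytes currently buffered */
};

/*
 * Extract one complete message from the buffer; does no I/O itself.
 * Returns the payload length, -1 if no complete message is buffered
 * yet, or -2 if the header is malformed or the payload does not fit.
 */
static int
msg_get(struct rbuf *rb, uint8_t *out, size_t outsz)
{
	uint16_t total;
	size_t paylen;

	if (rb->len < HDR_SIZE)
		return -1;
	memcpy(&total, rb->data, HDR_SIZE);
	if (total < HDR_SIZE)
		return -2;
	if (total > rb->len)
		return -1;
	paylen = total - HDR_SIZE;
	if (paylen > outsz)
		return -2;
	memcpy(out, rb->data + HDR_SIZE, paylen);
	/* Shift any following messages to the front of the buffer. */
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (int)paylen;
}

/* Process every buffered message, not just the first one. */
static int
msg_drain(struct rbuf *rb)
{
	uint8_t payload[256];
	int count = 0;

	while (msg_get(rb, payload, sizeof(payload)) >= 0)
		count++;
	return count;
}
```

A single read into the buffer may deliver several messages at once, so
calling msg_get() only once per poll event would leave the rest queued
until the next event.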


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/chaining/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.204 29-Sep-2018 claudio

Implement origin validation in bgpd. This introduces two new tables, the
roa-set for RPKI based origin validation and an origin-set which allows
looking up a source-as / prefix pair.
For RPKI a config can be built like this:
roa-set {
	165.254.255.0/24 source-as 15562
	193.0.0.0/21 maxlen 24 source-as 3333
}
deny from any ovs invalid
match from any ovs valid set community local-as:42
match from any ovs not-found set community local-as:43
Origin sets are similar but only match when the source-as / prefix pair is
valid.
match from any origin-set ARINDB set community local-as:44
Committing this now so that further work can be done in tree.
OK benno@, job@


# 1.203 29-Sep-2018 claudio

With the introduction of sets the config that is shipped to the RDE got
potentially much bigger. In bad cases the SE activated the config way
before the RDE which is not ideal. Introduce IMSG_RECONF_DRAIN which
acts as a barrier and ensures that both childs got all the config.
Only after that the IMSG_RECONF_DONE message is sent activating
the config in the childs more or less simultaneous.
OK benno@


# 1.202 25-Sep-2018 claudio

When sending set_tables in the imsg use the right size. Currently the
number of elements is used as size which is always wrong.
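
The distinction between element count and byte size can be shown with a
small sketch. The helper set_payload_size() is hypothetical and is not the
function changed by this commit; it only illustrates that the length handed
to the imsg layer has to be the count multiplied by the element size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed to ship nelems elements of
 * elemsize bytes each. Returns 0 if the product would overflow.
 */
static size_t
set_payload_size(size_t nelems, size_t elemsize)
{
	if (elemsize != 0 && nelems > SIZE_MAX / elemsize)
		return 0;
	return nelems * elemsize;
}
```

For a table of three 32-bit AS numbers the payload is 12 bytes, not 3, so
passing the element count as the size would truncate the data.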


# 1.201 21-Sep-2018 claudio

Implement code to parse, print and reload roa-set tables.
This shares a lot of code with prefixset, which makes it all a bit easier.
A roa-set is defined like this:
roa-set "test2" {
	1.2.3.0/24 source-as 1,
	1.2.8.0/22 maxlen 24 source-as 3
}
No support for acting on this data yet.
Put it in deraadt@, OK benno@, input and OK denis@


# 1.200 20-Sep-2018 claudio

Split up as_set into a set_table and an as_set. The first is what does
the lookup and will now also be used in roa-set tries. The as_set is glue
to add the name and dirty flag. Add an accessor to get the set data so
that the imsg sending and printing can be moved into the right places.
This is done mainly because roa-sets need similar but slightly different
versions and making the code more generic is the best way fixing this.
OK benno@


# 1.199 20-Sep-2018 claudio

Switch prefixset to an RB_TREE instead of a SIMPLEQ. This allows triggering
on duplicates (which are only reported) but is needed as a preparation step
for roa-sets.
OK benno@ denis@


# 1.198 09-Sep-2018 benno

Add network prefix-set <name> syntax to announce networks in a prefix-set.
feature discussed with deraadt@ and job@, ok claudio@


# 1.197 07-Sep-2018 claudio

Some space fixes mentioned by benno@


# 1.196 07-Sep-2018 claudio

Implement a fast prefix-set lookup. This magic trie is able to match a
prefix addr/plen to a prefix-set spec addr/plen prefixlen min - max (a
prefix including prefixlen range). Every addr/plen pair is a node in the
trie and the prefixlen is added as a bitmask to those nodes.
For the lookup any match is OK, there is no need to do longest or
best prefix matching.
Inspiration for this solution comes from the way bird implements this
which was done by Ondrej Zajicek santiago (at) crfreenet.org
OK benno@
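
The "prefixlen is added as a bitmask" idea can be sketched on its own, without the trie. This is only an illustration under stated assumptions: IPv4 only (prefix lengths 0 to 32), one 64-bit mask per node, and the hypothetical names plen_mask() and plen_match(); none of this is taken from the bgpd sources.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, IPv4 only: build a bitmask with one bit set per
 * allowed prefix length in the inclusive range [min, max].
 * The caller must ensure min <= max <= 32.
 */
static uint64_t
plen_mask(unsigned int min, unsigned int max)
{
	uint64_t hi = (UINT64_C(1) << (max + 1)) - 1;	/* bits 0..max */
	uint64_t lo = (UINT64_C(1) << min) - 1;		/* bits 0..min-1 */

	return hi & ~lo;
}

/* Return 1 if prefix length plen is allowed by mask, else 0. */
static int
plen_match(uint64_t mask, unsigned int plen)
{
	return plen <= 32 && ((mask >> plen) & 1);
}
```

With a mask like this stored in a node, a spec such as "prefixlen 24 - 28" reduces to a single bit test once the addr/plen node has been found.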


# 1.195 07-Sep-2018 claudio

Implement as-set, a fast lookup table to be used instead of long lists of
AS numbers in source-as, AS and transit-as filter statements. These tables
use bsearch to quickly verify if an AS is in the set or not.
The filter syntax is not fully set in stone yet.
OK denis@ benno@ and previously OK deraadt@
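
A minimal sketch of a bsearch(3)-based membership test over AS numbers. The helper names as_cmp(), as_table_sort() and as_table_match() are hypothetical and the table is assumed to be a plain array of uint32_t; the real as-set code may be organised differently.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort(3) and bsearch(3) over uint32_t AS numbers. */
static int
as_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a;
	uint32_t y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Hypothetical helper: sort the table once after it has been loaded. */
static void
as_table_sort(uint32_t *tbl, size_t n)
{
	qsort(tbl, n, sizeof(*tbl), as_cmp);
}

/* Hypothetical helper: return 1 if as is in the sorted table, else 0. */
static int
as_table_match(const uint32_t *tbl, size_t n, uint32_t as)
{
	return bsearch(&as, tbl, n, sizeof(*tbl), as_cmp) != NULL;
}
```

The table has to be sorted with the same comparison function before any lookup; each lookup is then O(log n) instead of a walk over a long list.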


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
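
The mechanism behind this teardown is that read(2) on a socket returns 0 once the peer has closed its end. A minimal sketch of that check, using a hypothetical helper peer_closed() that is not part of bgpd:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until the peer sends data or closes its end.
 * Returns 1 on EOF (peer closed), 0 if data arrived, -1 on error.
 */
static int
peer_closed(int fd)
{
	char	buf[64];
	ssize_t	n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}
```

A child that sees EOF on its IPC socket can leave its event loop and exit on its own, so the parent never needs to signal it.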


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
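
A minimal sketch of creating a socket that is non-blocking and close-on-exec from the start, instead of adjusting it afterwards with fcntl(2). It assumes a system that supports the SOCK_NONBLOCK and SOCK_CLOEXEC type flags (OpenBSD and Linux do); nb_socket() and fd_flags_ok() are hypothetical names, not bgpd functions. accept4(2) takes the same two flags for accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/*
 * Hypothetical helper: create a stream socket with O_NONBLOCK and
 * FD_CLOEXEC already set. Returns the fd, or -1 on error.
 */
static int
nb_socket(int family)
{
	return socket(family, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Return 1 if fd has both O_NONBLOCK and FD_CLOEXEC set, else 0. */
static int
fd_flags_ok(int fd)
{
	int fl = fcntl(fd, F_GETFL);
	int fdfl = fcntl(fd, F_GETFD);

	if (fl == -1 || fdfl == -1)
		return 0;
	return (fl & O_NONBLOCK) && (fdfl & FD_CLOEXEC);
}
```

Setting the flags at creation time removes the window in which a descriptor exists without them.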


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
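
A minimal sketch of the ordering described above: the flag is cleared before the work runs, so a signal that arrives while the work is in progress sets the flag again and is picked up on the next pass. All names here (reconfig_pending, reconfig_runs, sighup_handler, handle_reconfig) are hypothetical, and the counter merely stands in for the real work.

```c
#include <signal.h>

static volatile sig_atomic_t reconfig_pending;	/* set by the handler */
static int reconfig_runs;			/* counts completed runs */

/* Signal handler: only records that work was requested. */
static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/*
 * Hypothetical main-loop step. Clearing the flag first means a signal
 * delivered during the work is not overwritten by a late "flag = 0".
 */
static void
handle_reconfig(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		reconfig_runs++;	/* stands in for the real work */
	}
}
```

With the opposite order, a signal delivered between the start of the work and the assignment of 0 would be silently dropped.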


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
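
A minimal sketch of the status check described above. SIGCHLD is also delivered when a child stops or continues, so the status from waitpid(2) has to be inspected before concluding that the child is gone. child_is_gone() and wait_child_gone() are hypothetical helpers, not the bgpd code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: interpret a status value from waitpid(2).
 * Returns 1 if the child exited or was killed by a signal,
 * 0 if it merely stopped or continued.
 */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Hypothetical helper: wait for pid and report whether it is gone.
 * Returns 1 or 0 as child_is_gone() does, or -1 if waitpid(2) fails.
 */
static int
wait_child_gone(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return child_is_gone(status);
}
```

Note that the return value of waitpid(2) is a pid_t, not an int.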


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, what is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t
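
A small illustration of why the type matters, written as a sketch with
hypothetical helper names: the -1 error return only survives in a signed
type, while in a size_t it turns into a very large "byte count".

```c
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if read(2) on fd failed, 0 otherwise. */
int
read_failed(int fd)
{
	char buf[16];
	ssize_t n;		/* signed: can hold the -1 error return */

	n = read(fd, buf, sizeof(buf));
	return n == -1;
}

/*
 * What happens to the error value in an unsigned variable: it is
 * greater than zero, so it looks like a huge successful read.
 */
int
error_looks_like_success(void)
{
	size_t n = (size_t)-1;

	return n > 0;
}
```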


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.194 14-Jul-2018 benno

get rid of two more implicit ktable_get with rdomain 0.
should not change anything when run in rdomain 0.
ok henning@ phessler@ claudio@


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
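
The teardown idea can be sketched in a single process: once one end of
a socketpair is closed, a read on the other end returns 0 (EOF). In the
daemon the two ends live in different processes; the helper names here
are hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if the peer has closed its end (read sees EOF),
 * 0 if data was read, -1 on error.
 */
int
peer_closed(int fd)
{
	char buf[64];
	ssize_t n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}

/* Closing the "parent" end is the whole shutdown request. */
int
demo_teardown(void)
{
	int sv[2], r;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return -1;
	close(sv[0]);			/* "parent" closes its end */
	r = peer_closed(sv[1]);		/* "child" sees EOF */
	close(sv[1]);
	return r;
}
```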


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
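
A sketch of the flag-at-creation approach, assuming a platform that
defines SOCK_NONBLOCK and SOCK_CLOEXEC (OpenBSD and Linux do). The
helper names are hypothetical; accept4() takes the same two flags for
accepted connections.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a socket that is non-blocking and close-on-exec from birth. */
int
make_socket(void)
{
	return socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0);
}

/* Returns 1 if both flags are set on fd, 0 if not, -1 on error. */
int
has_both_flags(int fd)
{
	int fl, fdfl;

	if ((fl = fcntl(fd, F_GETFL)) == -1 ||
	    (fdfl = fcntl(fd, F_GETFD)) == -1)
		return -1;
	return (fl & O_NONBLOCK) && (fdfl & FD_CLOEXEC);
}
```

Setting the flags in the socket() call avoids the separate fcntl(2)
round trips a helper such as session_socket_blockmode() would need.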


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
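
A rough sketch of what a caller has to do under this convention. The
function flush_some and the enum are hypothetical and are not
msgbuf_write() itself; they only show the EAGAIN case being reported
separately from real errors and from a 0-byte write.

```c
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum flush_result { FLUSH_OK, FLUSH_AGAIN, FLUSH_CLOSED, FLUSH_ERROR };

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int
set_nonblocking(int fd)
{
	int fl;

	if ((fl = fcntl(fd, F_GETFL)) == -1)
		return -1;
	return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* One write attempt; *written is the number of bytes accepted. */
enum flush_result
flush_some(int fd, const char *buf, size_t len, size_t *written)
{
	ssize_t n;

	*written = 0;
	n = write(fd, buf, len);
	if (n == -1) {
		if (errno == EAGAIN)
			return FLUSH_AGAIN;	/* re-add the event, retry later */
		return FLUSH_ERROR;
	}
	if (n == 0)
		return FLUSH_CLOSED;		/* nothing accepted */
	*written = (size_t)n;
	return FLUSH_OK;
}
```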


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line, have them
in bgpd.conf. This allows restricted control sockets to be added or modified
at runtime. Feature request by a few people who often forgot to add -r path
when restarting bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
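
A minimal sketch of the fix described here: clear the pollfd array
before each poll(2) call and do not look at revents when poll reports
an error or a timeout. The function names are hypothetical.

```c
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if fd is readable, 0 on timeout, -1 on poll error. */
int
fd_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int n;

	memset(pfd, 0, sizeof(pfd));	/* no stale revents survive */
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	n = poll(pfd, 1, timeout_ms);
	if (n == -1)
		return -1;		/* e.g. EINTR: revents not valid */
	if (n == 0)
		return 0;		/* timeout */
	return (pfd[0].revents & POLLIN) ? 1 : 0;
}

/* Usage: a pipe's read end turns readable after one byte is written. */
int
pipe_demo(void)
{
	int p[2], before, after;

	if (pipe(p) == -1)
		return -1;
	before = fd_readable(p[0], 0);
	if (write(p[1], "x", 1) != 1)
		return -1;
	after = fd_readable(p[0], 0);
	close(p[0]);
	close(p[1]);
	return before == 0 && after == 1;
}
```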


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
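
The ordering can be sketched as follows. The flag, the handler and the
work function are hypothetical stand-ins for whatever the main loop
checks.

```c
#include <signal.h>

volatile sig_atomic_t reconfig_pending;	/* set by the signal handler */
int work_done;				/* counts calls to do_work() */

void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

void
do_work(void)
{
	work_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_work() runs sets the flag again and is picked up on the next
 * pass, instead of being wiped out by a late "flag = 0".
 */
void
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_work();
	}
}
```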


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
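
For readers unfamiliar with descriptor passing, a generic sketch of the
underlying mechanism (SCM_RIGHTS control messages over a UNIX-domain
socket). This is not the imsg/msgbuf code from this commit; send_fd and
recv_fd are hypothetical names.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/* Send one descriptor over sock. Returns 0 on success, -1 on error. */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c = 0;			/* one dummy payload byte */

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd, or -1. */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char c;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &c;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scenario described above, the privileged side would call
socket() and bind() and then hand the resulting descriptor to the
unprivileged side through a helper like send_fd().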


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
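
The status check described in this entry can be sketched as follows. This is a minimal illustration, not the code from bgpd.c: the names child_fate and classify_status are hypothetical, and the sketch assumes the status value was obtained from waitpid(2).

```c
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical classification of a waitpid(2) status value. */
enum child_fate { CHILD_EXITED, CHILD_SIGNALED, CHILD_OTHER };

/*
 * SIGCHLD alone does not say why it was delivered; the status word
 * returned by waitpid(2) has to be inspected.  On return, *detail
 * holds the exit code or the terminating signal number.
 */
static enum child_fate
classify_status(int status, int *detail)
{
	if (WIFEXITED(status)) {
		*detail = WEXITSTATUS(status);
		return CHILD_EXITED;
	}
	if (WIFSIGNALED(status)) {
		*detail = WTERMSIG(status);
		return CHILD_SIGNALED;
	}
	/* stopped or continued: the child is still alive */
	*detail = 0;
	return CHILD_OTHER;
}
```

A caller would log and quit only for CHILD_EXITED and CHILD_SIGNALED, and keep running for CHILD_OTHER.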


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
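
The read-once, drain-in-a-loop pattern described above can be sketched with a toy framing layer. This does not use the real imsg API: struct rbuf, rbuf_fill, frame_get and drain are hypothetical names, and the wire format (one length byte followed by the payload) is an assumption made for illustration only.

```c
#include <stddef.h>
#include <string.h>

#define RBUF_SIZE 256

/* Hypothetical read buffer; stands in for the per-pipe imsg buffer. */
struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;
};

/* Append bytes to the buffer; stands in for the read(2) step. */
static int
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > RBUF_SIZE - rb->len)
		return -1;
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return 0;
}

/*
 * Extract one frame (1 length byte + payload) without touching the fd.
 * Returns 1 if a complete frame was copied to out, 0 if more data is
 * needed.  out must have room for 255 bytes.
 */
static int
frame_get(struct rbuf *rb, unsigned char *out, size_t *outlen)
{
	size_t plen;

	if (rb->len < 1)
		return 0;
	plen = rb->data[0];
	if (rb->len < 1 + plen)
		return 0;
	memcpy(out, rb->data + 1, plen);
	*outlen = plen;
	rb->len -= 1 + plen;
	memmove(rb->data, rb->data + 1 + plen, rb->len);
	return 1;
}

/* Process every complete frame currently buffered; returns the count. */
static int
drain(struct rbuf *rb)
{
	unsigned char msg[255];
	size_t n;
	int count = 0;

	while (frame_get(rb, msg, &n))
		count++;	/* a real caller would dispatch msg here */
	return count;
}
```

Calling the extraction step only once per poll event would leave the second and later frames sitting in the buffer, which is the queue build-up the entry describes.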


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple
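
The decision described in this entry reduces to comparing the old and new state of the statement. A minimal sketch, with hypothetical names (fib_action, fib_action_on_reload); only the couple/decouple rule itself comes from the log message:

```c
/* Hypothetical result type for the reload decision. */
enum fib_action { FIB_NONE, FIB_COUPLE, FIB_DECOUPLE };

/*
 * Arguments are non-zero when "no fib-update" is present in the
 * old and new configuration respectively.
 */
static enum fib_action
fib_action_on_reload(int old_no_fib_update, int new_no_fib_update)
{
	if (old_no_fib_update && !new_no_fib_update)
		return FIB_COUPLE;	/* statement was removed */
	if (!old_no_fib_update && new_no_fib_update)
		return FIB_DECOUPLE;	/* statement was added */
	return FIB_NONE;		/* nothing changed */
}
```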


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.193 10-Jul-2018 benno

You can run multiple copies of bgpd in separate rdomains.

However, the processes will see each others route messages. Some
structures are not initialized correctly for that, causing at least
useless log messages.

This is an attempt to use the default_tableid where it's needed.

A few hardcoded uses of rtable 0 remain.

ok claudio@


Revision tags: OPENBSD_6_3_BASE
# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, I'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into a own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows to add/modify restricted control sockets on runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
descr "CUSTOMER1"
rd 65003:1
import-target rt 65003:3
export-target rt 65003:1
depend on mpe0
network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pt out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
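
The ordering fix can be sketched as follows. The names (reconfig_pending, on_sighup, do_reload, handle_pending) are hypothetical; only the rule "clear the flag first, then do the work" is taken from the log message.

```c
#include <signal.h>

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t reconfig_pending;

/* Counts completed reloads; stands in for the real work. */
static int reloads_done;

static void
on_sighup(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag before doing the work.  If a second signal arrives
 * while do_reload() runs, the flag is set again and the next loop
 * iteration picks it up.  Clearing it afterwards would wipe out that
 * second request.
 */
static int
handle_pending(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
		return 1;
	}
	return 0;
}
```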


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
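
A hedged sketch of the check described above, not the bgpd code itself. child_is_gone is a hypothetical helper. SIGCHLD is also delivered when a child merely stops or continues, so the status from waitpid() has to be inspected before concluding that the child is dead.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if child pid exited or was killed by a signal,
 * 0 if it is still around, -1 on waitpid() error.
 */
static int
child_is_gone(pid_t pid)
{
	int status;
	pid_t r;

	r = waitpid(pid, &status, WNOHANG);
	if (r == -1)
		return (-1);
	if (r == 0)
		return (0);	/* no state change for this child */
	if (WIFEXITED(status) || WIFSIGNALED(status))
		return (1);	/* caller would log and quit here */
	return (0);
}
```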


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
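
The split between reading and parsing can be sketched as below. This is an illustration with a made-up wire format (a 16-bit total length in host byte order followed by the payload), not the real imsg header. struct rbuf, rbuf_read and rbuf_get are hypothetical names standing in for the imsg_read/imsg_get pair.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	sizeof(uint16_t)

struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/* One read(2) into the buffer; returns what read(2) returned. */
static ssize_t
rbuf_read(struct rbuf *rb, int fd)
{
	ssize_t n;

	n = read(fd, rb->data + rb->len, sizeof(rb->data) - rb->len);
	if (n > 0)
		rb->len += (size_t)n;
	return (n);
}

/*
 * Extract at most one message from the buffer, never touching the fd.
 * Returns 1 if a message was copied to out, 0 if more data is needed,
 * -1 if the header is bogus.
 */
static int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz, size_t *outlen)
{
	uint16_t total;

	if (rb->len < HDR_SIZE)
		return (0);
	memcpy(&total, rb->data, HDR_SIZE);
	if (total < HDR_SIZE || total > RBUF_SIZE ||
	    total - HDR_SIZE > outsz)
		return (-1);
	if (rb->len < total)
		return (0);
	*outlen = total - HDR_SIZE;
	memcpy(out, rb->data + HDR_SIZE, *outlen);
	/* shift any following messages to the front */
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (1);
}
```

A caller would invoke rbuf_read() once per poll event and then call rbuf_get() in a loop until it returns 0, so that several messages delivered by a single read are all processed.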


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


# 1.192 10-Feb-2018 benno

Add prefix-sets, lists of prefixes which can be used in place of a
prefix in a filter rule. Initial idea hashed out with job@ in Toronto.
This is WIP, i'm committing it now so we can work on it in the tree.
ok florian@ claudio@


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
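
A minimal sketch of the "pipe teardown" idea under simplified assumptions (one child, one socketpair, no real message handling). child_loop and spawn_and_teardown are hypothetical names, not functions from bgpd.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/* Child event loop: returns 0 when the peer closed, -1 on error. */
static int
child_loop(int fd)
{
	char buf[256];
	ssize_t n;

	for (;;) {
		n = read(fd, buf, sizeof(buf));
		if (n == 0)
			return (0);	/* EOF: parent closed its end */
		if (n == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		/* a real daemon would process n bytes here */
	}
}

/* Start a child, shut it down without kill(2); returns its exit code. */
static int
spawn_and_teardown(void)
{
	int sv[2], status;
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-1);
	if ((pid = fork()) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}
	if (pid == 0) {
		close(sv[0]);	/* child keeps only its own end */
		_exit(child_loop(sv[1]) == 0 ? 0 : 1);
	}
	close(sv[1]);
	close(sv[0]);		/* teardown: child now reads EOF */
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

Since the parent never signals the child by PID, it cannot hit a recycled PID by mistake.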


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@
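
As an illustration of the replacement, the flags can be requested atomically at socket creation instead of with separate fcntl(2) calls afterwards. open_session_socket is a hypothetical name. accept4() takes the same SOCK_NONBLOCK and SOCK_CLOEXEC flags for accepted connections and is left out of this sketch.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <fcntl.h>

/* Create a TCP socket that is non-blocking and close-on-exec from birth. */
static int
open_session_socket(void)
{
	return (socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK | SOCK_CLOEXEC, 0));
}
```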


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno
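
What "check for EAGAIN and poll again" means for a caller can be sketched with plain write(2) on a non-blocking descriptor. This is not the msgbuf_write code. flush_queue is a hypothetical helper, and treating a zero-byte write as an error is an assumption of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to write buf[*off .. len) to a non-blocking fd.
 * Returns 1 when everything was written, 0 when the fd would block
 * (caller should poll for POLLOUT and retry), -1 on a hard error.
 */
static int
flush_queue(int fd, const char *buf, size_t len, size_t *off)
{
	ssize_t n;

	while (*off < len) {
		n = write(fd, buf + *off, len - *off);
		if (n == -1) {
			if (errno == EINTR)
				continue;
			if (errno == EAGAIN || errno == EWOULDBLOCK)
				return (0);	/* not an error: try later */
			return (-1);
		}
		if (n == 0)
			return (-1);	/* assumption: treat as failure */
		*off += (size_t)n;
	}
	return (1);
}
```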


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
        descr "CUSTOMER1"
        rd 65003:1
        import-target rt 65003:3
        export-target rt 65003:1
        depend on mpe0
        network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporarily in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
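
A small sketch of the defensive pattern, assuming a single descriptor. wait_readable is a hypothetical helper, not bgpd code. The array is zeroed before each call, and revents is only looked at when poll() reports at least one ready descriptor.

```c
#include <errno.h>
#include <poll.h>
#include <string.h>

/*
 * Returns 1 if fd is readable, 0 on timeout or EINTR,
 * -1 on any other poll() error.
 */
static int
wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd[1];
	int nfds;

	memset(pfd, 0, sizeof(pfd));	/* no stale revents from last round */
	pfd[0].fd = fd;
	pfd[0].events = POLLIN;

	nfds = poll(pfd, 1, timeout_ms);
	if (nfds == -1)
		return (errno == EINTR ? 0 : -1);
	if (nfds == 0)
		return (0);		/* timeout: do not touch revents */
	return ((pfd[0].revents & POLLIN) ? 1 : 0);
}
```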


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected respectively all static routes. The list is auto-
matically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receipt of a SIGCHLD reset io_pid or rde_pid, respectively, depending
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
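
A minimal sketch of the ordering described above, not the actual bgpd code.
The names reconfig_pending, sighup_handler, do_reload and handle_pending are
hypothetical and only illustrate the "clear first, then work" pattern:

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical names; bgpd's real flags and handlers differ. */
static volatile sig_atomic_t reconfig_pending;
static int reload_count;

/* Signal handler: only records that work is pending. */
static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

/* Stand-in for the real work; just counts invocations. */
static void
do_reload(void)
{
	reload_count++;
}

/* Returns 1 if a pending request was handled, 0 otherwise. */
static int
handle_pending(void)
{
	if (!reconfig_pending)
		return 0;
	/*
	 * Clear the flag before doing the work: a signal arriving while
	 * do_reload() runs sets the flag again and is seen on the next
	 * pass. Clearing afterwards would silently drop that signal.
	 */
	reconfig_pending = 0;
	do_reload();
	return 1;
}
```

With this order a signal delivered during do_reload() costs at most one
extra reload, which is harmless, instead of being lost.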


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
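
A minimal sketch of the status check described above, under the assumption
that only exit or death by signal counts as "the child went away". The
helper names child_is_gone and check_child are hypothetical, not taken from
bgpd.c:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the wait status says the child is
 * really gone (exited or killed by a signal), 0 if it merely stopped.
 */
static int
child_is_gone(int status)
{
	return WIFEXITED(status) || WIFSIGNALED(status);
}

/*
 * Wait for a specific child. Returns 1 if it terminated, 0 if it only
 * changed state (e.g. stopped), -1 if waitpid() failed.
 */
static int
check_child(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, WUNTRACED) == -1)
		return -1;
	return child_is_gone(status);
}
```

A SIGCHLD is also delivered when a child stops, so acting on the signal
alone would shut the daemon down for a child that is still alive.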


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
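
The split between "read into the buffer" and "extract one message" can be
sketched as follows. This is not the imsg API: struct rbuf, frame_get and
drain are hypothetical, and the one-length-byte framing is a toy stand-in
for the real imsg header:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical read buffer; the real imsg structures differ. */
struct rbuf {
	uint8_t	data[256];
	size_t	len;		/* bytes currently buffered */
};

/*
 * Toy framing: one length byte followed by that many payload bytes.
 * Copies one complete message out of the buffer and returns 1, or
 * returns 0 if no complete message is buffered yet. It never reads
 * from a file descriptor itself.
 */
static int
frame_get(struct rbuf *rb, uint8_t *out, size_t *outlen)
{
	size_t n;

	assert(rb->len <= sizeof(rb->data));
	if (rb->len < 1)
		return 0;
	n = rb->data[0];
	if (rb->len < 1 + n)
		return 0;
	memcpy(out, rb->data + 1, n);
	*outlen = n;
	rb->len -= 1 + n;
	memmove(rb->data, rb->data + 1 + n, rb->len);
	return 1;
}

/* After one read filled rb, process every complete message in it. */
static int
drain(struct rbuf *rb)
{
	uint8_t payload[255];	/* a length byte can never exceed 255 */
	size_t len;
	int handled = 0;

	while (frame_get(rb, payload, &len))
		handled++;
	return handled;
}
```

Calling frame_get() only once per poll event would leave the second message
sitting in the buffer until more data arrives, which is the queue build-up
the commit describes.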


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() calls in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for an ack message from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.


Revision tags: OPENBSD_6_2_BASE
# 1.191 12-Aug-2017 florian

Make not yet implemented pledges more visible in grep output.
input benno, deraadt, tedu
also standardize on #if 0 since it makes tedu's editor vomit.
OK benno, pirofti on a previous version


# 1.190 27-Jun-2017 deraadt

move a global into local context; from rob pierce


# 1.189 28-May-2017 henning

so far, bgpd was hardcoded to use rtable 0 for nexthop verification.
instead, use the rtable bgpd was started in (route -T <n> exec / rc.d
daemon_rtable) for nexthop verification and as default Adj-RIB-In and
Loc-RIB. This allows multiple bgpds in different rdomains on the same
machine - bgp router virtualization if you like buzzwords.
initial version written under contract more than a year ago, it took us
a while to wrap our brains around the bgpd <-> rdomain interactions -
1) RIBs, 2) nexthop verification and 3) tcp sockets.
ok & input phessler claudio benno


Revision tags: OPENBSD_6_1_BASE
# 1.188 24-Jan-2017 benno

sync log.c from relayd et al to bgpd.

there is still a little difference regarding handling of the verbosity
value that will be handled later.

ok claudio@ florian@


# 1.187 03-Sep-2016 renato

Simplify shutdown process.

On shutdown, there's no need to use kill(2) to kill the child
processes. Just closing the IPC sockets will make the children receive
an EOF, break out from the event loop and then exit.

The advantages of this "pipe teardown" are:
* simpler code;
* no need to pledge "proc" in the parent process;
* removal of a (hard to trigger) PID reuse race condition.

ok benno@ claudio@
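
A minimal sketch of the EOF detection that makes this teardown work. The
helper peer_closed is hypothetical; bgpd's children notice the EOF inside
their imsg handling instead:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical child-side check: returns 1 if the peer closed its end
 * (read returns 0, i.e. EOF), 0 if data arrived, -1 on error.
 */
static int
peer_closed(int fd)
{
	char buf[64];
	ssize_t n;

	n = read(fd, buf, sizeof(buf));
	if (n == -1)
		return -1;
	return n == 0;
}
```

Once the parent closes its end of the socketpair, the next read in the
child returns 0, so no kill(2) and no stored child pid are needed.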


# 1.186 02-Sep-2016 benno

work on making log.c similar in all daemons:

move daemon-local functions into new logmsg.c, and reduce
the (mostly whitespace) differences so that log.c's can be diffed easily.

ok claudio@, feedback from henning@, deraadt@, reyk@


Revision tags: OPENBSD_6_0_BASE
# 1.185 20-Jun-2016 benno

change the "nexthop 1.2.3.4 now valid: via 192.168.0.1" message to log_debug()
ok deraadt@ florian@ stsp@ phessler@


Revision tags: OPENBSD_5_9_BASE
# 1.184 02-Feb-2016 sthen

Remove setproctitle() for the parent process. Because rc.d(8) uses process
titles (including flags) to distinguish between daemons, this makes it
possible to manage multiple copies of a daemon using the normal infrastructure
by symlinking rc.d scripts to a new name. ok jung@ ajacoutot@, smtpd ok gilles@


# 1.183 27-Nov-2015 claudio

Improve error messages for the imsg handler code. OK sthen@


# 1.182 20-Nov-2015 florian

bgpd has been naughty. It tries to play with AF_UNIX sockets without
pledging "unix".
Move control_listen up to the main process which already has
pledge("unix"). accept(2) was already allowed.

(Technically no longer necessary since listen(2) is now allowed, too,
but this moves it to the right place.)

OK claudio@, deraadt@


# 1.181 17-Nov-2015 benno

pledge() exposes a design issue in bgpd that will take a moment to
get right, so disable the pledge() call in bgpd (parent process) for now.
ok deraadt@


# 1.180 12-Nov-2015 benno

pledge the bgpd main process. Some of the promises can be improved upon
with a bit of rework, so comment why they are needed.
ok deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.179 04-Aug-2015 phessler

Give more precise errors, to help track when bgpd quits

OK florian@ benno@ sthen@ deraadt@


# 1.178 20-Jul-2015 claudio

Make bgpd execute the RDE and session engine process instead of just forking.
This way ASLR and stack cookies are per process.
With input from benno@ and deraadt@
OK benno@


# 1.177 16-Jul-2015 claudio

Next round of config cleanup. Move various lists into the bgpd_config struct.
This is the next step to better split parsing and merging the config.
OK benno@


# 1.176 14-Mar-2015 claudio

Move the command line options (mainly -d and -v) out of struct bgpd_config
into their own flag field since these can't be modified via a config reload.
OK henning@ benno@ before lock


Revision tags: OPENBSD_5_7_BASE
# 1.175 09-Feb-2015 claudio

Kill session_socket_blockmode() and replace it with SOCK_CLOEXEC or
SOCK_NONBLOCK and accept4(). OK henning@ tested & OK benno@


Revision tags: OPENBSD_5_5_BASE OPENBSD_5_6_BASE
# 1.174 13-Nov-2013 benno

from claudio
"Let msgbuf_write return -1 with errno EAGAIN. The users then must
check if this was the case and readd the event or poll again. The
current handling in the imsg code is wrong for sure."

ok gilles, benno


# 1.173 13-Nov-2013 florian

Knob to set priority with which bgpd inserts routes into the kernel
routing table. Need for it in "special" setups pointed out by
Loic Blot (loic.blot _AT_ unix-experience _DOT_ fr) on tech.
OK benno, henning


Revision tags: OPENBSD_5_4_BASE
# 1.172 31-May-2013 claudio

Unfuck BGP MPLS VPNs that got broken by the last few reload related commits.
OK henning@


# 1.171 07-Mar-2013 claudio

Implements a few missing bits for better templates support:
- on config reload also adjust the cloned neighbors so that they get the
config changes as well.
- clean up sessions that are 1h idle but in state active (instead of down)
- add bits to allow bgpctl to destroy cloned neighbors
Tested by sthen@ some time ago, OK phessler@


Revision tags: OPENBSD_5_3_BASE
# 1.170 02-Nov-2012 florian

Unstick bgpctl reload after reloading a bgpd.conf with errors.

ok claudio, benno


# 1.169 18-Sep-2012 claudio

Only allow one reload request at a time in bgpd. Needed for further work.
OK sthen@, benno@, henning@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.168 20-Aug-2011 sthen

Decouple log_verbose() from log_init() so the verbose flag stays set with
"-v" (previously only "-vd" worked). Similar to recent ospfd commit.
ok claudio@


Revision tags: OPENBSD_5_0_BASE
# 1.167 01-May-2011 claudio

Free cname and rcname on exit. Found by Milosz Jakubowski


Revision tags: OPENBSD_4_9_BASE
# 1.166 02-Sep-2010 sobrado

remove trailing spaces and tabs from source code; no binary changes
(verified by both sthen@ and me).

ok sthen@; "just commit it" claudio@


Revision tags: OPENBSD_4_8_BASE
# 1.165 28-Jun-2010 sobrado

remove -r and -s from usage, these options were dropped on a previous
change to bgpd; while here, rewrite usage() in a more usual way.

ok jmc@


# 1.164 27-Jun-2010 claudio

Instead of specifying the control sockets on the command line have them
in bgpd.conf. This allows adding/modifying restricted control sockets at runtime.
Feature request by a few people who often forgot to add -r path when restarting
bgpd (including myself).
NOTE: this removes the -s and -r arguments from bgpd so pay attention when
updating.
jajaja sthen@, OK henning@


# 1.163 19-May-2010 claudio

Add softreconfig support for peers changing the RIB. Done by first unloading
the old RIB and then via softreconfig in and a special softreconfig out loading
the new RIB.
Feature requested and tested by Elisa Jasinska.
OK henning@


# 1.162 17-May-2010 claudio

Last bits of MPLS VPN support. Hook kernel routing tables and RIB together.
This adds a bit of new config to specify the mapping between an rdomain and
the BGP MPLS VPN instance, example:
rdomain 1 {
	descr "CUSTOMER1"
	rd 65003:1
	import-target rt 65003:3
	export-target rt 65003:1
	depend on mpe0
	network 192.168.224/24
}
The "depend on mpe0" is a bit ugly but for now this is the quickest way to
figure out which interface bgp should use to insert the MPLS routes.

A big side-effect of this diff is that networks are now internally
distributed through kroute.c.
This needs some kernel changes that will follow hopefully soon.
OK henning@


# 1.161 03-May-2010 claudio

Make it possible to load multiple routing tables at the same time and use
those for alternate RIBs. This allows to use "rde rib TESTIT rtable 1".
NOTE: nexthop verification has changed for alternate tables. For now
nexthop will only be verified against the main routing table (id 0).
Because of this "nexthop qualify via bgp" may now compare the nexthops
against bgpd routes from a different RIB.
Tested by sthen@, OK to move on by henning@


# 1.160 26-Apr-2010 claudio

Add some { } for better readability and to make the code look like the
other blocks in this function.


# 1.159 26-Apr-2010 claudio

Fix some memory leaks on config reload failure and move one particular
cleanup loop to parse.y where it belongs.
OK henning@


# 1.158 22-Apr-2010 claudio

Including bgpd.h in mrt.h is dumb.


# 1.157 13-Apr-2010 claudio

Instead of passing AF specific struct kroutes over imsgs use a struct
kroute_full structure that is AF independent and has all information in
it. Simplifies the communication between processes and reduces the number
of imsg types. This is another step to add FIB support to BGP MPLS VPNs.


# 1.156 29-Mar-2010 claudio

Since we always reload the config now there is no need to allocate the
filter list head. It is only used temporary in reconfigure().
OK henning


Revision tags: OPENBSD_4_7_BASE
# 1.155 03-Mar-2010 claudio

Remove superfluous newline


# 1.154 11-Feb-2010 claudio

We need to load the config before kr_init() is called or fib-update is
ignored. Found and fix tested by Elisa Jasinska.


# 1.153 11-Jan-2010 claudio

Do not crash when starting up with a bad config file. Check that
conf.listen_addr is actually valid before deref.


# 1.152 31-Dec-2009 claudio

Instead of passing the config via arguments to the childs on bootup issue
a config reload as first step in bootup. This allows childs to start with
an empty config and a lot of special cleanup code can bite the dust.
Testing by myself and sthen@ with a few configs (more testing welcome).
Seems like a good idea henning@ & sthen@


# 1.151 01-Dec-2009 claudio

Use an artificial address family id in struct bgpd_addr and almost everywhere
else. Adds conversion functions to map AFI/SAFI and the Unix AF_ values
from and into AID used in bgpd. This is needed to support things like MPLS
VPN and other upcoming changes that need to play a lot with AFI/SAFI pairs.
Mostly mechanical change, henning@ has no particular issues with this.
Must go in so that I can continue working.


# 1.150 02-Nov-2009 claudio

Implement IMSG_CTL_LOG_VERBOSE similar to ospfd. Even though bgpd has almost
no log_debug() it makes more sense to make all routing daemons behave the same.


# 1.149 20-Jul-2009 claudio

On config reload errors free the list of ribs so that following reloads
don't fail because of redefinition conflicts. This problem was reported
by some people.


Revision tags: OPENBSD_4_6_BASE
# 1.148 07-Jun-2009 claudio

First attempt at reload support for RIBs. There is some magic that I do
not fully understand but at least no flames are coming out of my test
box anymore.


# 1.147 05-Jun-2009 claudio

Adjust print_config to all the stuff added in the last days.


# 1.146 04-Jun-2009 claudio

Add "rde rib <name>" to the config and allow the rde to use these other RIBs.
Still a bit hackish, reload is missing and printconf as well. Looks good h@


Revision tags: OPENBSD_4_4_BASE OPENBSD_4_5_BASE
# 1.145 12-May-2008 pyr

Error out with usage line if additional arguments are given after the
option parsing. Found out the hard way by jdixon on ifstated.

ok sobrado@, jdixon@, millert@


Revision tags: OPENBSD_4_2_BASE OPENBSD_4_3_BASE
# 1.144 11-May-2007 claudio

Various spelling fixes from Stuart Henderson.


Revision tags: OPENBSD_4_1_BASE
# 1.143 26-Jan-2007 claudio

Massive rework of the control imsg flow. Main changes:
- dedicated pipe between the SE and the RDE for control messages
- restartable RB tree dumps in the RDE
- queuing limits both in the SE and RDE
The result is a dramatic decrease of memory consumption on operations like
bgpctl show rib. Previously all messages were first stored in the RDE
then passed to the SE where they got queued in case bgpctl was not fast enough.
Now only a small number of messages is generated and passed to the SE and
the SE has an additional limit instead of acting like an infinite buffer.
Without this the bgpd on bgpd.networx.ch would not survive a single minute.
looks good henning@


# 1.142 04-Jan-2007 henning

ignore SIGPIPE, like the other 2 processes already do. we detect broken
pipes without the signal just fine. ok claudio


# 1.141 04-Jan-2007 claudio

Do not run rde_shutdown() unless bgpd is started with -d.
On some of my systems rde_shutdown() takes more than 3min doing nothing more
than calling free(3) over and over again.


# 1.140 28-Nov-2006 henning

allow bgpd to work on alternate routing tables, claudio ok, jmc manpage help


Revision tags: OPENBSD_4_0_BASE
# 1.139 19-Jun-2006 jmc

add -c to usage() and synopsis;


# 1.138 17-Jun-2006 henning

implement carp demotion control for bgpd.
sessions can be configured to modify the carp demotion counter for a
given interface group (usually, "carp", which has all carp interfaces)
when the session is not established. once the session is established for
60 seconds, the demotion is cleared.
this, used correctly, can prevent a bgpd-box which lost all sessions (and
thus has no routes) to be carp master, while the backup has sessions.
thought through and partially hacked on a drive from calgary to vancouver
with ryan, ok claudio


# 1.137 27-May-2006 claudio

Pass an IMSG_CTL_RESULT message back to bgpctl on reloads to indicate if
the reload was successful or not. OK henning@


# 1.136 26-Apr-2006 claudio

Last argument to send_filterset() is a left-over from one of my not so clever
ideas that will never be included and always set to 0. Kill it.


# 1.135 22-Mar-2006 claudio

Change the way bgpd selects nexthops. Up until now every route was considered
when calculating the nexthop. Now only non BGP routes and not the default
route are used unless forced with the new config options
nexthop qualify via bgp
nexthop qualify via default
This change is required for complex setups e.g. where an additional IGP is
running. OK henning@


# 1.134 15-Mar-2006 claudio

Sync usage with man page (sort arguments).


# 1.133 15-Mar-2006 claudio

Allow the control socket to be changed on the command line. Useful if you
need to run multiple bgpds on a single box to simulate an IX. This helped
me massively debugging error reports. OK henning@


Revision tags: OPENBSD_3_9_BASE
# 1.132 24-Jan-2006 claudio

Functions in the poll() loop should only be moved around if there are no
side-effects. Revert last changes and make bgpctl reload work again.


# 1.131 24-Jan-2006 henning

KNF


# 1.130 24-Jan-2006 henning

introduce a second control socket, which is restricted to certain messages,
namely the show ones. needed for looking glass style applications,
monitoring etc. claudio ok


# 1.129 03-Jan-2006 claudio

Plug some mem leaks.


# 1.128 03-Jan-2006 claudio

Move the signal handler flags check between the poll() call and the poll
fd handling. Do not access poll fd in case of an error or timeout.
With and OK dlg@


# 1.127 24-Dec-2005 claudio

bzero the pfd array before setting it up and calling poll because on error
(e.g. EINTR) poll() will not update the pfd array (copyout) and so the old
revents are used and results in a blocking parent process. OK dlg@
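
A minimal sketch of the idea in this commit, not the actual bgpd code: the pollfd array is zeroed on every pass, so stale revents cannot survive a poll() call that fails with EINTR. The helper name poll_once() and the two-entry array are assumptions made for illustration.

```c
#include <poll.h>
#include <string.h>

#define PFD_COUNT 2

/*
 * Hypothetical helper: set up the pollfd array from scratch and poll once.
 * Returns whatever poll() returns (-1 on error, 0 on timeout, otherwise the
 * number of entries with events pending).
 */
static int
poll_once(int fd_a, int fd_b, struct pollfd pfd[PFD_COUNT], int timeout_ms)
{
	/* Wipe old fd/events/revents so a failed poll() leaves no stale data. */
	memset(pfd, 0, sizeof(struct pollfd) * PFD_COUNT);

	pfd[0].fd = fd_a;
	pfd[0].events = POLLIN;
	pfd[1].fd = fd_b;
	pfd[1].events = POLLIN;

	return poll(pfd, PFD_COUNT, timeout_ms);
}
```

A caller would still check the return value first and only look at revents when it is greater than zero, which is the ordering introduced in 1.128.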


# 1.126 02-Nov-2005 claudio

Reorder and comment reconfigure(). Makes more sense so.


# 1.125 01-Nov-2005 claudio

Switch from the per peer filter set list to a filter-only solution.
The default filter_sets are converted into match filter rules that get
evaluated first. Simplifies code massively -- mainly the config reload
part -- and makes softreconfig out a piece of cake. "get it in" henning@


# 1.124 13-Oct-2005 claudio

Simplify poll loop as well. "grrr, OK" henning@


Revision tags: OPENBSD_3_8_BASE
# 1.123 01-Jul-2005 claudio

Switch filter_sets from SIMPLEQ to TAILQ, needed for upcoming stuff.


# 1.122 29-Jun-2005 claudio

rtlabel support via filter sets. Just use "set rtlabel foobar" in filters
network and neighbor statements and the routes are labeled accordingly.
While doing that fix some mem-leaks by introducing filterset_free() and
remove the free on send option of send_filterset().
This took a bit longer because we need to carefully track the rtlabel id
refcnts or bad things may happen on reloads.
henning@ looks fine


# 1.121 09-Jun-2005 claudio

Change the "network connected|static" statements to "network inet|inet6
connected|static" so that it is possible to distinguish between IPv4 and IPv6
addresses. "network connected|static" is considered deprecated but will be
supported as an alias for "network inet connected|static" for some time (one
release) to simplify upgrades. This also solves a nasty crash when using
"network connected". OK henning@


# 1.120 27-May-2005 henning

will throw claudio in a big pot of kaesefondue for repeated whitespace fuckups


# 1.119 27-May-2005 claudio

kroute6 support, at least partially. Get it in so that Henning can clean it
up more. OK henning@


# 1.118 23-May-2005 henning

one more endpwent


# 1.117 28-Apr-2005 claudio

Support for "network connected" and "network static" -- announce all
directly connected or all static routes, respectively. The list is
automatically adjusted as soon as a route changes.
OK henning@


# 1.116 30-Mar-2005 henning

bgpd used to open listeners in advance in the parent and the SE picked
those it needed, closing all the others. this has some nasty races.
so let the parent keep the list of listeners so it knows when it has
to open a new one
claudio ok, also tested by jason ackley


# 1.115 28-Mar-2005 henning

free rules_l if the initial config file parse fails


# 1.114 24-Mar-2005 tedu

fix memory leak in error paths. found with coverity prevent.
ok claudio henning


Revision tags: OPENBSD_3_7_BASE
# 1.113 09-Feb-2005 henning

need to send IMSG_NETWORK_DONE after sending networks and associated filter
sets, otherwise local networks get withdrawn after config reload;
misbehaviour noticed by peter.galbavy@knowtion.net, claudio ok


# 1.112 02-Feb-2005 henning

usage() is __dead
pointed out by Alexander v Gernler


# 1.111 23-Nov-2004 claudio

Switch from a single filter_set to a linked list of sets. With this change
it is possible to specify multiple communities. This is also the first step
to better bgpd filters. OK henning@


# 1.110 19-Oct-2004 henning

allow neighbor definitions to depend on interface state.
with this, if a neighbor is configured as dependent on carp0 for example,
the neighbor will remain in state IDLE as long as carp0 is not master.
once carp0 becomes master the session(s) depending on it immediately
go to CONNECT (or ACTIVE, if they're configured passive), reducing failover
time. claudio ok, with some input from ryan as well


# 1.109 23-Sep-2004 henning

after receival of a SIGCHLD reset io_pid or rde_pid, respectively, dependent
on which child went away.


# 1.108 16-Sep-2004 henning

imsg API cleanup:
-kill imsg_compose_pid, imsg_compose_fdpass and imsg_create_pid
-extend the original imsg_compose/_create API to take pid & fd too
-make imsg_compose do imsg_create + imsg_add + imsg_close instead of
duplicating the code
-adjust all callers to the new API
ok claudio


# 1.107 16-Sep-2004 henning

malloc the imsg buffers instead of having them statically, suggested by
micskye some time ago


# 1.106 15-Sep-2004 otto

if (signalflag) { dowork(); signalflag = 0; } is a race. First clear flag,
then call work(). ok henning@
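
The ordering described here can be sketched as follows. The names reconfig_pending, do_reload() and check_signals() are hypothetical stand-ins, not taken from bgpd.c; the point is only that the flag is cleared before the work starts.

```c
#include <signal.h>

/* Set by the signal handler, consumed by the main loop. */
static volatile sig_atomic_t reconfig_pending;

/* Hypothetical counter standing in for the real reload work. */
static int reloads_done;

static void
sighup_handler(int sig)
{
	(void)sig;
	reconfig_pending = 1;
}

static void
do_reload(void)
{
	reloads_done++;
}

/*
 * Clear the flag first, then do the work. A signal that arrives while
 * do_reload() is running sets the flag again and is picked up on the next
 * pass, instead of being wiped out by a late "flag = 0".
 */
static void
check_signals(void)
{
	if (reconfig_pending) {
		reconfig_pending = 0;
		do_reload();
	}
}
```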


Revision tags: OPENBSD_3_6_BASE
# 1.105 24-Aug-2004 henning

use session_socket_blockmode() instead of hand-rolling roughly the same
claudio ok


# 1.104 05-Aug-2004 claudio

The peer_l is not needed in the rde but still allocated, free them and
save 1k per peer. OK henning@


# 1.103 03-Aug-2004 claudio

Fix mem-leak on exit. OK henning@


# 1.102 28-Jul-2004 claudio

The whole dance to close a mrt file after fd passing in the parent is not
needed as the fd is closed while being passed. looks good henning@


# 1.101 05-Jul-2004 henning

fix a few KNF fallouts


# 1.100 04-Jul-2004 henning

2 more file descriptors for each RDE and SE inherited from the parent
we should close


# 1.99 04-Jul-2004 henning

when getting rid of the listen_addr TAILQ after forking actually close
the file descriptors in RDE and parent process, not needed or used there


# 1.98 03-Jul-2004 claudio

Switch mrt dumping to fd passing. This gives some speed up when extensive
dumping is done. Actually mrt dumps were broken because of the fd passing.
The nice side effect is a much cleaner code, especially in the parent process.
OK henning@


# 1.97 20-Jun-2004 henning

at least somewhat consistently name the TAILQ_ENTRYs... this confused me
more than once


# 1.96 20-Jun-2004 henning

implement file descriptor passing in the imsg/msgbuf framework, and use
it to let the main process to prepare new listening sockets (socket() and
bind()) on behalf of the session engine, which of course cannot bind() to
ports < 1024 any more once it dropped privileges. with some help from theo,
claudio ok
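
For readers unfamiliar with descriptor passing, here is a minimal sketch of the underlying mechanism (SCM_RIGHTS over a UNIX-domain socket). It is not the imsg/msgbuf implementation from this commit; send_fd() and recv_fd() are hypothetical helpers, and the sketch assumes the two processes share a socketpair(2).

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	struct cmsghdr	hdr;
	char		buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd over the UNIX-domain socket sock. Returns 0 on success, -1 on error. */
static int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char dummy = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &dummy;		/* at least one byte of real data */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd, or -1 on error. */
static int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union fd_cmsg cmsgbuf;
	char dummy;
	int fd;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) <= 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the scenario the commit describes, the privileged process would call socket() and bind() itself and hand the resulting descriptor to the unprivileged process with something like send_fd().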


# 1.95 06-Jun-2004 henning

rework bgpd's handling of listening sockets. instead of one for each
supported address family, keep a tailq of an arbitrary number of them.
the new struct listen_addr contains the sockaddr and the fd.
this fixes quite some nasty behaviour which was a consequence of the previous
model.
looks right deraadt@, and discussed with claudio


# 1.94 21-May-2004 claudio

Add support for dynamic announcements. Useful to announce temporary
blackhole routes or to make network announcements dependent on an external
state (e.g. for carp setups) OK henning@


# 1.93 07-May-2004 djm

add a filter option to dump prefixes learned in UPDATEs into a PF table,
intended for building realtime BGP blacklists (e.g. with spamd);
ok claudio & henning


# 1.92 03-May-2004 henning

little KNF issue


# 1.91 29-Apr-2004 deraadt

sock -> fd; ok henning


# 1.90 27-Apr-2004 deraadt

crud stripping; henning ok


# 1.89 25-Apr-2004 claudio

Remove the no longer needed configure stuff in RDE. The peer list needs no
longer to be synced between parent, SE and RDE. OK henning@


Revision tags: OPENBSD_3_5_BASE
# 1.88 16-Mar-2004 henning

delay creating the control socket until after forking, but before chroot
(lives in /var/run, i. e. outside chroot) and privdrop.
claudio ok


# 1.87 12-Mar-2004 henning

fix the "wait for child processes to terminate" code, and move it down a bit
millert ok


# 1.86 11-Mar-2004 claudio

Shutdown the RDE cleanly on exit. Plug some memleaks. OK henning@


# 1.85 11-Mar-2004 claudio

Free unneeded mrt lists in SE and on exit. OK henning@


# 1.84 10-Mar-2004 henning

pass a pointer to the network list as well to session_main so we can free()
the members after fork


# 1.83 10-Mar-2004 henning

pass a pointer to the filter rule list to session_main() so we can free()
the list entries and the head there after forking


# 1.82 10-Mar-2004 henning

free peer list on exit, claudio ok


# 1.81 01-Mar-2004 claudio

Arrrg. Not my day. Sync printconfig with parser here too.


# 1.80 19-Feb-2004 claudio

Make the code more portable. Add some missing header files and make the use
of the queue(3) macros more portable. OK henning@ some time ago.


# 1.79 09-Feb-2004 henning

print networks too


# 1.78 09-Feb-2004 henning

drain the list fluffier


# 1.77 09-Feb-2004 henning

print more fluff


# 1.76 09-Feb-2004 henning

move printing the config to where it belongs


# 1.75 07-Feb-2004 henning

send filter rules to the RDE on reloads, help & ok claudio


# 1.74 06-Feb-2004 henning

initial cut at the filtering language.
structs etc to describe a rule, filter rule list management
parser groks filter defs now.

claudio ok, discussion & help also jakob theo


# 1.73 03-Feb-2004 henning

defer free()ing the previous peer list until after parsing the config file
so in the parser we can access it. will be needed soon.


# 1.72 23-Jan-2004 henning

use log_addr


# 1.71 22-Jan-2004 henning

use log_warnx and log_info. reclassify a few messages in the process and fix
a few messages.

ok claudio@


# 1.70 22-Jan-2004 henning

s/log_err/log_warn/
it is like warn(3), not err(3). so use a less confusing name.


# 1.69 20-Jan-2004 henning

check early whether user _bgpd exists so we can bail out early and nicely
prodded by theo


# 1.68 17-Jan-2004 claudio

Make it possible to announce own networks. In the RDE these prefixes are
attached to a pseudo peer and inserted like all other prefixes into the RIB.
OK henning@


# 1.67 17-Jan-2004 henning

allow the interfaces as bgpd sees 'em to be queried via imsgs


# 1.66 11-Jan-2004 henning

use bgpd_addr in the nexthop tree; change nexthop_add/_remove accordingly

ok claudio@


# 1.65 11-Jan-2004 henning

in the nexthop imsgs use struct bgpd_addr for the data part instead of
in_addr_t

ok claudio@


# 1.64 11-Jan-2004 henning

use struct bgpd_addr for nexthop and gateway in struct kroute_nexthop
(and thus the nexthop messages between parent and RDE)

ok claudio@


# 1.63 11-Jan-2004 henning

new message IMSG_CTL_SHOW_NEXTHOP: request/send list of BGP nexthops and
the result of their validity check


# 1.62 11-Jan-2004 claudio

The buffer changes produced some fallout in the mrt code.
Wait until all data has been written out before closing the file and fix
some obvious mistakes. OK henning@


# 1.61 09-Jan-2004 henning

for IMSG_CTL_KROUTEs allow matching based on flags,
add IMSG_CTL_KROUTE_ADDR to match the route for a given address

ok claudio@


# 1.60 09-Jan-2004 henning

get us a stateful imsg relaying framework, and the first receiver,
IMSG_CTL_KROUTE, to have the kroute structs forming the fib sent to a
control socket.

ok claudio@


# 1.59 08-Jan-2004 henning

rename a few functions to further clarify things


# 1.58 06-Jan-2004 henning

2004


# 1.57 05-Jan-2004 claudio

Big overhaul of the mrt code.
Dumping of incoming bgp messages is now possible and dumping the (not yet)
filtered updates works too. Per neighbor dumps are still missing.
OK henning@


# 1.56 05-Jan-2004 henning

correctly handle SIGCHLD.
SIGCHLD does _not_ translate to "a child process went kaboom".
waitpid() and check status; if the child exited or terminated log & quit

ok claudio@
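
A rough sketch of the check described above, under the assumption that the parent knows the pid of the child it cares about. child_gone() is a hypothetical helper, not the function used in bgpd.c.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/*
 * Called after SIGCHLD was noted. Returns 1 if the child exited or was
 * killed by a signal, 0 if it is still around (no state change, or merely
 * stopped/continued), and -1 if waitpid() itself failed.
 */
static int
child_gone(pid_t child)
{
	int status;
	pid_t pid;	/* waitpid() returns a pid_t, not an int */

	do {
		pid = waitpid(child, &status, WNOHANG);
	} while (pid == -1 && errno == EINTR);

	if (pid == -1)
		return -1;
	if (pid == 0)
		return 0;	/* SIGCHLD did not mean this child went away */

	return (WIFEXITED(status) || WIFSIGNALED(status)) ? 1 : 0;
}
```

Only when this returns 1 would the parent log the event and begin shutting down.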


# 1.55 05-Jan-2004 henning

waitpid's return is a pid_t


# 1.54 05-Jan-2004 henning

allow fib couple/decouple based on an imsg received on the control socket
by the SE and passed on to the main process


# 1.53 04-Jan-2004 henning

-new imsg CTL_RELOAD
-upon receipt in the SE forward to parent
-make sending messages from SE to parent work for that (was not required before)
-parent reacts to that just like a SIGHUP, reread config file


# 1.52 03-Jan-2004 henning

move some session specific stuff to session.h and make the few files
that need it include that


# 1.51 03-Jan-2004 henning

decouple the peer list from bgpd_config.
so many parts of bgpd are not at all interested in the session specific peer
structs... allows for some further cleaning


# 1.50 03-Jan-2004 henning

change imsg_read semantics so that the number of bytes read is returned.
that means that the callers can (and must) cope with closed connections
themselves, which is exactly the desired behaviour.


# 1.49 01-Jan-2004 henning

listen on a AF_LOCAL socket for imsgs too.
only implemented type yet is IMSG_CTL_SHOW_NEIGHBOR which sends back
the struct peer for all neighbors.
will be used by bgpdctl


# 1.48 01-Jan-2004 henning

now that imsg_get uses bigger buffers, one read call can put more than one
imsg into the buffer. since imsg_get by definition only returns one imsg we
missed the next imsg(s) until the next poll event on the socket in question,
building up a queue on that socket. didn't show up as a problem yet...
factor out imsg_read, which reads into the buffer. imsg_get now entirely
operates on the buffers and does not read(2) itself.
make all callers cope by calling imsg_read on poll events and calling
imsg_get in a loop until all imsgs are processed.
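
The split between "read into the buffer" and "take one message out of the buffer" can be illustrated with a toy framing layer. This is not the imsg API: struct rbuf, rbuf_fill(), rbuf_get() and rbuf_drain() are hypothetical, the header is assumed to be a 2-byte total length in host byte order, and rbuf_fill() stands in for the read(2) call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RBUF_SIZE	4096
#define HDR_SIZE	sizeof(uint16_t)

struct rbuf {
	unsigned char	data[RBUF_SIZE];
	size_t		len;		/* bytes currently buffered */
};

/* Append bytes as if they had just come from read(2). Returns bytes stored. */
static size_t
rbuf_fill(struct rbuf *rb, const void *src, size_t n)
{
	if (n > RBUF_SIZE - rb->len)
		n = RBUF_SIZE - rb->len;
	memcpy(rb->data + rb->len, src, n);
	rb->len += n;
	return n;
}

/*
 * Extract exactly one message without touching the descriptor.
 * Returns the payload length, -1 if no complete message is buffered yet,
 * or -2 if the header is malformed or the payload does not fit in out.
 */
static int
rbuf_get(struct rbuf *rb, unsigned char *out, size_t outsz)
{
	uint16_t total;
	size_t payload;

	if (rb->len < HDR_SIZE)
		return -1;
	memcpy(&total, rb->data, HDR_SIZE);
	if (total < HDR_SIZE)
		return -2;
	if (total > rb->len)
		return -1;
	payload = total - HDR_SIZE;
	if (payload > outsz)
		return -2;

	memcpy(out, rb->data + HDR_SIZE, payload);
	/* Shift any following messages to the front of the buffer. */
	memmove(rb->data, rb->data + total, rb->len - total);
	rb->len -= total;
	return (int)payload;
}

/* On a poll event: fill once, then loop until the buffer holds no full message. */
static int
rbuf_drain(struct rbuf *rb)
{
	unsigned char msg[RBUF_SIZE];
	int count = 0;

	while (rbuf_get(rb, msg, sizeof(msg)) >= 0)
		count++;	/* a real caller would dispatch msg here */
	return count;
}
```

Calling the extract function only once per poll event would leave the second message sitting in the buffer, which is the queue build-up the commit message describes.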


# 1.47 30-Dec-2003 henning

correctly free after buf_add/_close errs.
From: Patrick Latifi <pat@eyeo.org>


# 1.46 30-Dec-2003 henning

missing free()s in error cases that (now) lead to program termination
From: Patrick Latifi <pat@eyeo.org>


# 1.45 27-Dec-2003 henning

move the fib couple/decouple to the config merge where it belongs


# 1.44 27-Dec-2003 henning

keep a copy of the fd locally instead of passing it around all time


# 1.43 27-Dec-2003 henning

on reconfigure, check whether the "no fib-update" statement presence/absence
changed.
if it is absent but was present before, call kroute_fib_couple
if it is present but was absent before, call kroute_fib_decouple
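
The decision above boils down to comparing one option bit before and after the reload. A small sketch, assuming a hypothetical flag OPT_NO_FIB_UPDATE and a hypothetical helper fib_action_on_reload(); only the names kroute_fib_couple and kroute_fib_decouple come from the commit message.

```c
/* Hypothetical option bit for the "no fib-update" statement. */
#define OPT_NO_FIB_UPDATE	0x01

enum fib_action { FIB_KEEP, FIB_COUPLE, FIB_DECOUPLE };

/* Decide what to do with the kernel routing table after a config reload. */
static enum fib_action
fib_action_on_reload(int old_opts, int new_opts)
{
	int was_set = (old_opts & OPT_NO_FIB_UPDATE) != 0;
	int is_set = (new_opts & OPT_NO_FIB_UPDATE) != 0;

	if (was_set && !is_set)
		return FIB_COUPLE;	/* caller would call kroute_fib_couple */
	if (!was_set && is_set)
		return FIB_DECOUPLE;	/* caller would call kroute_fib_decouple */
	return FIB_KEEP;		/* statement unchanged, nothing to do */
}
```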


# 1.42 27-Dec-2003 henning

implement "no fib-update" much cooler


# 1.41 26-Dec-2003 henning

erm, oups, well, put back rde_pid and io_pid assignments that got lost
somehow...


# 1.40 26-Dec-2003 henning

fix logging in send_nexthop_update


# 1.39 26-Dec-2003 henning

let imsg_get and imsg_compose not fatal() but return errors upstream.
make the callers cope.


# 1.38 26-Dec-2003 henning

when this project started and i added the fatal() function, I made it take
the error number as parameter instead of accessing errno, because in one
place the error number was not in errno but fetched from a socket.
now, of course it makes much more sense to just set errno to the error number
just fetched in this one place instead of having hundreds of fatal() calls
all transfer the errno round and round and round...
fix this, and also provide a fatalx, which does not care for errno and doesn't
invoke strerror.
oh, btw, in the place where we fetch the err # from the socket, we don't
call fatal anymore anyway...


# 1.37 26-Dec-2003 henning

by making kroute_dispatch_msg() and kroute_nexthop_add() return int instead
of void they can now report errors upstream and do not need to panic any
more. so do that and handle the errors in bgpd.c in the vein that we at least
can clean up before exit.
there are no direct fatal() call in kroute.c now any more, nor any in bgpd.c
after forking.


# 1.36 26-Dec-2003 henning

overhaul error handling
try to handle as much as possible in a graceful way so that we don't leave
the kernel routing table full of our routes, for example.


# 1.35 26-Dec-2003 henning

handle kroute_init failures nicer


# 1.34 26-Dec-2003 henning

improve log message


# 1.33 26-Dec-2003 henning

log nexthop status changes


# 1.32 26-Dec-2003 henning

handle IMSG_NEXTHOP_DELETE as well


# 1.31 26-Dec-2003 henning

kroute_nexthop_check -> kroute_nexthop_add
kroute_validate_nexthop -> kroute_nexthop_insert


# 1.30 26-Dec-2003 henning

finally marry rde and kroute parts of the nexthop verification:
handle IMSG_NEXTHOP_ADD and send IMSG_NEXTHOP_UPDATE when appropriate


# 1.29 25-Dec-2003 henning

track routing table changes that are _not_ caused by bgpd itself

ok claudio@


# 1.28 25-Dec-2003 henning

kill IMSG_KROUTE_ADD as well. just send KROUTE_CHANGE requests.


# 1.27 25-Dec-2003 henning

it actually makes more sense to call the merged function kroute_change


# 1.26 25-Dec-2003 henning

kroute_change is obsolete, long live kroute_add


# 1.25 24-Dec-2003 henning

now that the main process can cleanup without RDE's help, we do not need the
somewhat fragile IMSG_SHUTDOWN_* stuff any more. speeds shutdown up
enormously.

ok claudio@


# 1.24 24-Dec-2003 henning

now that we keep track of the routes we added to the kernel we can remove
them easily on shutdown without the RDE's help


# 1.23 24-Dec-2003 henning

slightly more helpful error msgs


# 1.22 24-Dec-2003 henning

handle write() returning 0 correctly, adjust the msgbuf API appropriately,
and make all callers cope.


# 1.21 23-Dec-2003 deraadt

spacing


# 1.20 23-Dec-2003 henning

send shutdown requests to the child processes and wait for a ackmessage from
them when shutting down.
the rde needs the main process to clean up the routing table on exit so the
parent process needs to be in service until the RDE is done.
ok claudio@


# 1.19 22-Dec-2003 henning

delay setting up the signal handlers in the main process until after fork(),
pointed out by theo


# 1.18 22-Dec-2003 henning

o add low-level functions for adding/changing/removing kernel routes
o define new imsg types for this
o process these imsgs in the parent process

now "only" debugging and the rde sending these messages is missing.

ok claudio@


# 1.17 22-Dec-2003 henning

uid check later; configtest is useful as non-root


# 1.16 22-Dec-2003 henning

add a configtest mode


# 1.15 22-Dec-2003 miod

No need to setup a signal handler for SIGKILL as you can't catch it anyway.


# 1.14 22-Dec-2003 henning

in the same vein we can plain errx() if the geteuid check fails.


# 1.13 22-Dec-2003 henning

when starting up and the configuration has errors, do not call fatal().
plain exit() is enough. we have not yet forked and an error message is already
printed by the parser.
inspired by a theo request


# 1.12 21-Dec-2003 henning

rename get_imsg() to imsg_get(); that's more consistent


# 1.11 21-Dec-2003 henning

wrap read & write buffers for imsgs into a struct.
finally gives us read buffers per pipe instead of per process, eliminating
a possible race.
also gets us a real imsg_init() that does all the boring init work


# 1.10 21-Dec-2003 henning

overhaul the write buffering code.
introduce msgbuf API and bundle all info needed for the write buffers in a
struct msgbuf.
also switch to a write queue per handled connection (each bgp session, each
pipe) instead of one big one.
fixes some subtle problems and is overall nicer.

ok claudio@


# 1.9 21-Dec-2003 henning

use pipe(2)s instead of socketpair(2)s.
suggested by tedu@ for a performance gain, ok claudio@


# 1.8 20-Dec-2003 henning

more from the castathon; imsg_compose takes void * now so get rid of the casts


# 1.7 20-Dec-2003 henning

few missing break; in default: cases in switch; one noticed by tedu@


# 1.6 20-Dec-2003 deraadt

spacing


# 1.5 20-Dec-2003 henning

parent: waitpid() for the child processes on exit


# 1.4 20-Dec-2003 henning

keep track which process we are so fatal() can log in which proc the
condition happened. fatal()s from subsystems used by all 3 processes like
the imsg subsystem were hard to track down without knowing in which process
the condition happened.


# 1.3 20-Dec-2003 henning

read(2)/write(2) return ssize_t, not size_t


# 1.2 17-Dec-2003 henning

send reconf requests to the RDE as well and handle them there; syncing peer
data with RIB missing
use same message in RDE and SE for consistency


# 1.1 17-Dec-2003 henning

welcome, bgpd
started by me some time ago with moral support from theo, then proceeded up to
the point where the session engine worked correctly. claudio jeker joined
then and did a lot of work in the RDE.
it is not particularly useful as an application right now as parts are still
missing but is imported to enable more people to work on it.
status:
BGP sessions get established fine, OPEN messages and then KEEPALIVEs
exchanged etc. session FSM works fine; NOTIFICATIONs are handled fine, and
all connection drops etc I provoked get handled fine.
Incoming UPDATE messages are parsed well and the data entered to the RIB,
the decision process is not yet there, neither is outgoing UPDATEs or sync
to the kernel routing table.

not connected to the builds yet.