#
1.78 |
|
10-Mar-2024 |
riastradh |
wg(4): Bind to CPU in wg_handle_packet.
Required by use of psref there.
Assert we're bound up front so we catch mistakes early, rather than later on if we get unlucky in preemption and scheduling.
PR bin/58021
|
Revision tags: thorpej-altq-separation-base
|
#
1.77 |
|
01-Aug-2023 |
mrg |
branches: 1.77.2;
fix simple mis-matched function prototypes and definitions.
most of these are like, e.g.,
void foo(int[2]);
with either of these
void foo(int*) { ... }
void foo(int[]) { ... }
in some cases (such as stat or utimes* calls found in our header files), we now match standard definition from opengroup.
found by GCC 12.
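A minimal sketch of the kind of mismatch described in this entry. `sum_pair` is a hypothetical name used only for illustration and does not come from the NetBSD tree. In C, array parameters are adjusted to pointers, so all three spellings declare the same function type. The fix is to make the prototype and the definition use the same spelling, so the compiler has nothing to diagnose.

```c
/* Prototype as it might appear in a header: array-of-2 spelling. */
int sum_pair(const int v[2]);

/*
 * Definition using the same spelling as the prototype.  Writing
 * "const int *v" or "const int v[]" here would declare the same
 * function type, but the inconsistent spelling is what the compiler
 * complains about.
 */
int
sum_pair(const int v[2])
{
	return v[0] + v[1];
}
```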
|
#
1.76 |
|
11-Apr-2023 |
jakllsch |
Give scope and additional details to wg(4) diagnostic messages.
|
#
1.75 |
|
05-Apr-2023 |
andvar |
s/termintaed/terminated/ in comment.
|
#
1.74 |
|
05-Jan-2023 |
christos |
centralize the kauth ugliness.
|
#
1.73 |
|
05-Jan-2023 |
jakllsch |
wg(4): Allow non-root to retrieve information other than the private key and the peer preshared key.
Add kauth(9) enums for wg(4) and use them in suser secmodel.
Refines fix for PR 57161.
|
#
1.72 |
|
05-Jan-2023 |
jakllsch |
Check for authorization for SIOCSDRVSPEC and SIOCGDRVSPEC ioctls for wg(4).
Addresses PR 57161.
|
Revision tags: netbsd-10-base
|
#
1.71 |
|
04-Nov-2022 |
ozaki-r |
branches: 1.71.2;
inpcb: rename functions to inpcb_*
Inspired by rmind-smpnet patches.
|
#
1.70 |
|
28-Oct-2022 |
ozaki-r |
Adjust pf, wg, dccp and sctp for struct inpcb integration
|
Revision tags: bouyer-sunxi-drm-base
|
#
1.69 |
|
25-Mar-2022 |
hannken |
Prevent memory corruption from wg_send_handshake_msg_init() on LP64 machines with "MSIZE == 256", sparc64 for example.
wg_send_handshake_msg_init() tries to put 148 bytes into a buffer of 144 bytes and overwrites 4 bytes following the mbuf. Check for "sizeof() > MHLEN" and use a cluster in this case.
With help from Taylor R Campbell <riastradh@>
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
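The idea of bounding a user-supplied request length before using it can be sketched as follows. This is a userland illustration under stated assumptions, not the driver's code: `sketch_alloc_request` and the 4096-byte cap `SKETCH_MAX_REQUEST_LEN` are hypothetical. Rejecting large lengths first means the later `len + 1` cannot wrap around and the allocation size stays bounded.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cap chosen for illustration; not the driver's value. */
#define SKETCH_MAX_REQUEST_LEN	((size_t)4096)

/*
 * Allocate a zeroed buffer for a user-supplied request of `len` bytes
 * plus a terminating NUL.  The length check comes first, so `len + 1`
 * cannot overflow and an oversized request never reaches the allocator.
 */
static int
sketch_alloc_request(size_t len, char **bufp)
{
	char *buf;

	if (len > SKETCH_MAX_REQUEST_LEN)
		return E2BIG;
	buf = calloc(1, len + 1);
	if (buf == NULL)
		return ENOMEM;
	*bufp = buf;
	return 0;
}
```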
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signedness, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4;
wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2;
wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
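The clamping step can be sketched as a saturating conversion. This is an illustration only: `sketch_secs_to_ticks` is a hypothetical helper, and where the driver actually applies its clamp is not stated in this entry. The point is that an unsigned duration multiplied by the tick rate must not exceed what an `int` tick count can hold.

```c
#include <limits.h>

/*
 * Convert a duration in seconds to callout ticks, saturating at
 * INT_MAX instead of overflowing.  `hz` is the tick rate and is
 * assumed to be positive.
 */
static int
sketch_secs_to_ticks(unsigned secs, int hz)
{
	/* If secs <= INT_MAX / hz, then secs * hz fits in an int. */
	if (secs > (unsigned)INT_MAX / (unsigned)hz)
		return INT_MAX;
	return (int)(secs * (unsigned)hz);
}
```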
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
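For readers unfamiliar with the pattern, here is a userland sketch of recovering an enclosing structure from a pointer to one of its members. The macro `sketch_container_of` and the `sketch_peer` / `sketch_work` types are hypothetical stand-ins, not the kernel's definitions. Unlike a cast through `void *`, this keeps working if the embedded member is not the first field.

```c
#include <stddef.h>

/* Userland stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member)				\
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_work {
	int pending;
};

struct sketch_peer {
	int			id;
	struct sketch_work	work;	/* embedded member, not first */
};

/* Recover the enclosing peer from a pointer to its embedded work item. */
static struct sketch_peer *
sketch_work_to_peer(struct sketch_work *wk)
{
	return sketch_container_of(wk, struct sketch_peer, work);
}
```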
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
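A portable sketch of what a byte-wise big-endian store does, for comparison with casting a buffer to `uint32_t *` and assigning `htonl(v)`. `sketch_be32enc` is a hypothetical userland stand-in, not the system's be32enc. Because it writes one byte at a time, the destination needs no particular alignment.

```c
#include <stdint.h>

/*
 * Store a 32-bit value in big-endian order one byte at a time, so the
 * destination may sit at any address.
 */
static void
sketch_be32enc(void *dst, uint32_t v)
{
	uint8_t *p = dst;

	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}
```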
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
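The reason memcmp is unsuitable for secrets is that it may return as soon as it finds a differing byte, so its running time can leak how long the matching prefix is. A minimal sketch of a comparison that always touches every byte follows. `sketch_ct_equal` is a hypothetical illustration, not the library's consttime_memequal, and a production version needs more care to stop the compiler from reintroducing an early exit.

```c
#include <stddef.h>

/*
 * Compare n bytes without branching on the data inside the loop, so
 * the running time does not depend on where the first difference is.
 * Returns 1 if the buffers are equal, 0 otherwise.
 */
static int
sketch_ct_equal(const void *a, const void *b, size_t n)
{
	const volatile unsigned char *pa = a;
	const volatile unsigned char *pb = b;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < n; i++)
		diff |= (unsigned char)(pa[i] ^ pb[i]);
	return diff == 0;
}
```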
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
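As background, a sliding-window replay check can be sketched with a single bitmap. This is a generic illustration, not the driver's implementation: the names are hypothetical and the 64-packet window size is an assumption chosen to fit one `uint64_t`. The receiver remembers the highest sequence number seen and one bit per recent sequence number, accepts anything newer, and rejects anything already seen or too old to track.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WINDOW_BITS	64

struct sketch_window {
	uint64_t top;		/* highest sequence number accepted so far */
	uint64_t bitmap;	/* bit i set => (top - i) already seen */
	bool	 any;		/* false until the first packet is accepted */
};

/*
 * Return true and record `seq` if it is new; return false if it is a
 * replay or too old to track.
 */
static bool
sketch_window_accept(struct sketch_window *w, uint64_t seq)
{
	uint64_t shift;

	if (!w->any) {
		w->any = true;
		w->top = seq;
		w->bitmap = 1;
		return true;
	}
	if (seq > w->top) {
		/* Newer than anything seen: slide the window forward. */
		shift = seq - w->top;
		w->bitmap = (shift >= SKETCH_WINDOW_BITS) ? 0 :
		    w->bitmap << shift;
		w->bitmap |= 1;
		w->top = seq;
		return true;
	}
	shift = w->top - seq;
	if (shift >= SKETCH_WINDOW_BITS)
		return false;		/* fell off the back of the window */
	if (w->bitmap & (UINT64_C(1) << shift))
		return false;		/* already seen */
	w->bitmap |= UINT64_C(1) << shift;
	return true;
}
```

Out-of-order packets inside the window are accepted exactly once, which is the property a replay check needs over an unordered transport such as UDP.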
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.77 |
|
01-Aug-2023 |
mrg |
fix simple mis-matched function prototype and definitions.
most of these are like, eg
void foo(int[2]);
with either of these
void foo(int*) { ... } void foo(int[]) { ... }
in some cases (such as stat or utimes* calls found in our header files), we now match standard definition from opengroup.
found by GCC 12.
|
#
1.76 |
|
11-Apr-2023 |
jakllsch |
Give scope and additional details to wg(4) diagnostic messages.
|
#
1.75 |
|
05-Apr-2023 |
andvar |
s/termintaed/terminated/ in comment.
|
#
1.74 |
|
05-Jan-2023 |
christos |
centralize the kauth ugliness.
|
#
1.73 |
|
05-Jan-2023 |
jakllsch |
wg(4): Allow non-root to retrieve information other than the private key and the peer preshared key.
Add kauth(9) enums for wg(4) and add use them in suser secmodel.
Refines fix for PR 57161.
|
#
1.72 |
|
05-Jan-2023 |
jakllsch |
Check for authorization for SIOCSDRVSPEC and SIOCGDRVSPEC ioctls for wg(4).
Addresses PR 57161.
|
Revision tags: netbsd-10-base
|
#
1.71 |
|
04-Nov-2022 |
ozaki-r |
branches: 1.71.2; inpcb: rename functions to inpcb_*
Inspired by rmind-smpnet patches.
|
#
1.70 |
|
28-Oct-2022 |
ozaki-r |
Adjust pf, wg, dccp and sctp for struct inpcb integration
|
Revision tags: bouyer-sunxi-drm-base
|
#
1.69 |
|
25-Mar-2022 |
hannken |
Prevent memory corruption from wg_send_handshake_msg_init() on LP64 machines with "MSIZE == 256", sparc64 for example.
wg_send_handshake_msg_init() tries to put 148 bytes into a buffer of 144 bytes and overwrites 4 bytes following the mbuf. Check for "sizeof() > MHLEN" and use a cluster in this case.
With help from Taylor R Campbell <riastradh@>
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signnes, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB-byte stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at comple-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop- gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are. - Improves parallelism -- softint was kernel-locked to serialize access to the pcq. - Requires per-peer queue on handshake init to avoid dropping first packet. . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wgp_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
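A minimal sketch of the clamping idea, assuming a timeout given in seconds and a positive tick rate `hz`. The function name `secs_to_ticks` is hypothetical and this is not the driver's actual code.

```c
#include <limits.h>

/*
 * Convert a timeout in seconds to callout ticks, clamping the input so
 * that the product always fits in an int.  hz must be greater than 0.
 */
static int
secs_to_ticks(unsigned secs, int hz)
{
	const unsigned max_secs = (unsigned)(INT_MAX / hz);

	if (secs > max_secs)
		secs = max_secs;
	return (int)secs * hz;
}
```

Clamping before the multiplication, rather than after, is what avoids the signed overflow.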
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
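For reference, the usual way to compare 32-bit sequence numbers across wraparound is serial-number arithmetic. The sketch below only illustrates that general technique under the assumption of two's-complement conversion; `seq32_after` is a hypothetical name and nothing here is taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if sequence number a is "after" b, treating the 32-bit
 * space as circular.  The unsigned subtraction wraps modulo 2^32; the
 * conversion to int32_t assumes two's complement, which holds on all
 * mainstream platforms.
 */
static bool
seq32_after(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) > 0;
}
```

The comparison is only meaningful while the two numbers are less than 2^31 apart.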
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
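A simplified userland sketch of the container_of idiom follows. The macro here is a plain offsetof-based version and may differ from the kernel's definition; `struct work`, `struct peer`, and `peer_from_work` are hypothetical names for illustration.

```c
#include <stddef.h>

/* Simplified version: recover the enclosing struct from a member pointer. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work {
	int pending;
};

struct peer {
	int id;
	struct work task;	/* embedded member handed to callbacks */
};

/* A callback that only receives the embedded member can get back to the peer. */
static struct peer *
peer_from_work(struct work *wk)
{
	return container_of(wk, struct peer, task);
}
```

Unlike a cast through `void *`, this keeps working if `task` is not the first member of `struct peer`.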
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
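The difference can be sketched in portable C. `sketch_be32enc` below is a hypothetical stand-in that behaves like a byte-wise big-endian encoder; it is not the system's be32enc itself.

```c
#include <stdint.h>

/*
 * Store a 32-bit value in big-endian order one byte at a time.  The
 * destination may have any alignment, unlike a store through a
 * uint32_t pointer cast combined with htonl().
 */
static void
sketch_be32enc(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```

Writing at `buf + 1`, for instance, is well defined here, whereas `*(uint32_t *)(buf + 1) = htonl(v)` can fault on strict-alignment machines.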
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
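The reason is that memcmp may return as soon as it finds a differing byte, so its running time can leak how many leading bytes matched. A minimal sketch of the constant-time alternative is below; `ct_memequal` is a hypothetical name, and a compiler is in principle free to optimize such code, so this is an illustration rather than a vetted implementation.

```c
#include <stddef.h>

/*
 * Compare len bytes without an early exit: every byte is always
 * examined.  Returns 1 if the buffers are equal, 0 otherwise.
 */
static int
ct_memequal(const void *a, const void *b, size_t len)
{
	const unsigned char *p = a, *q = b;
	unsigned int diff = 0;

	for (size_t i = 0; i < len; i++)
		diff |= p[i] ^ q[i];
	/* diff is in 0..255; (diff - 1) >> 8 has its low bit set only for 0. */
	return 1 & ((diff - 1) >> 8);
}
```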
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
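As a rough illustration of sliding-window replay detection, here is a sketch using a single 64-bit bitmap. The window size, the names `replay_window` and `replay_check_and_update`, and the layout are assumptions for the example; the driver's real implementation may be organized differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_BITS 64

struct replay_window {
	uint64_t last;		/* highest sequence number accepted so far */
	uint64_t bitmap;	/* bit i set => (last - i) was already seen */
	bool any;		/* false until the first packet is accepted */
};

/* Return true and record seq if it is fresh; false if replayed or too old. */
static bool
replay_check_and_update(struct replay_window *w, uint64_t seq)
{
	if (!w->any) {
		w->any = true;
		w->last = seq;
		w->bitmap = 1;
		return true;
	}
	if (seq > w->last) {
		/* Newer than anything seen: slide the window forward. */
		uint64_t shift = seq - w->last;

		w->bitmap = (shift >= WINDOW_BITS) ? 0 : (w->bitmap << shift);
		w->bitmap |= 1;
		w->last = seq;
		return true;
	}

	uint64_t back = w->last - seq;

	if (back >= WINDOW_BITS)
		return false;		/* fell off the back of the window */

	uint64_t bit = UINT64_C(1) << back;

	if (w->bitmap & bit)
		return false;		/* already seen: replay */
	w->bitmap |= bit;
	return true;
}
```

Out-of-order packets inside the window are accepted once each; anything older than the window is rejected outright.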
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
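A cast via void * recovers the enclosing object only when the embedded member happens to sit at offset zero, and it discards type information on the way. The sketch below shows the container_of idea in userland; CONTAINER_OF is defined locally, and struct work, struct peer_task and task_from_work are hypothetical names, not taken from if_wg.c.

```c
#include <stddef.h>

/* Userland equivalent of the kernel macro, defined here for the sketch. */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical structures; the field names are not the driver's. */
struct work { int pending; };
struct peer_task {
	int		id;
	struct work	wk;	/* embedded member handed to a callback */
};

/* Recover the enclosing object from a pointer to its embedded member. */
static struct peer_task *
task_from_work(struct work *wk)
{
	return CONTAINER_OF(wk, struct peer_task, wk);
}
```

Because the offset is computed from the member name, the lookup keeps working if fields are later added in front of the embedded member.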
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
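Storing through a uint32_t pointer cast from a byte buffer can trap on strict-alignment CPUs when the buffer is not 4-byte aligned, whereas a byte-wise encoder has no alignment requirement and no dependence on host byte order. A minimal sketch of what such an encoder does; sketch_be32enc and sketch_be32dec are hypothetical local helpers written for illustration, not the system routines.

```c
#include <stdint.h>

/* Hypothetical helper: store v big-endian, one byte at a time. */
static void
sketch_be32enc(void *buf, uint32_t v)
{
	uint8_t *p = buf;

	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Hypothetical helper: load a big-endian 32-bit value byte-wise. */
static uint32_t
sketch_be32dec(const void *buf)
{
	const uint8_t *p = buf;

	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Both helpers accept any address, so a header field at an odd offset inside a packet buffer needs no special handling.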
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
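memcmp may return as soon as it finds the first differing byte, so its running time can reveal how many leading bytes of a guess match a secret. A constant-time comparison touches every byte regardless. The sketch below illustrates the idea only; sketch_ct_memequal is a hypothetical name, and real code should call the library routine, since a compiler is not obliged to preserve the timing properties of hand-written C.

```c
#include <stddef.h>

/*
 * Hypothetical illustration: returns 1 if the buffers are equal and
 * 0 otherwise.  The loop always runs len iterations and never exits
 * early on a mismatch.
 */
static int
sketch_ct_memequal(const void *a, const void *b, size_t len)
{
	const volatile unsigned char *pa = a, *pb = b;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < len; i++)
		diff |= pa[i] ^ pb[i];	/* accumulate any difference */
	return diff == 0;
}
```

Note the return convention: nonzero means equal, the opposite of memcmp, so a mechanical substitution must also invert the test.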
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
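A sliding window accepts packets that arrive slightly out of order while rejecting any counter that was already seen or that is too old to track. The sketch below shows one common bitmap formulation, assuming a 64-entry window; the window size, struct replay_window and replay_check_and_update are assumptions made for illustration, since the log does not describe the driver's actual layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_BITS 64	/* assumed window size for this sketch */

struct replay_window {
	uint64_t last;		/* highest counter accepted so far */
	uint64_t bitmap;	/* bit i set => counter (last - i) was seen */
	bool	 any;		/* false until the first packet is accepted */
};

/* Returns true and records ctr if it is fresh; false if replayed or too old. */
static bool
replay_check_and_update(struct replay_window *w, uint64_t ctr)
{
	uint64_t age, shift;

	if (!w->any) {			/* first packet: start the window */
		w->any = true;
		w->last = ctr;
		w->bitmap = 1;
		return true;
	}
	if (ctr > w->last) {		/* newer: slide the window forward */
		shift = ctr - w->last;
		w->bitmap = (shift >= WINDOW_BITS) ? 0 : (w->bitmap << shift);
		w->bitmap |= 1;
		w->last = ctr;
		return true;
	}
	age = w->last - ctr;
	if (age >= WINDOW_BITS)
		return false;		/* fell off the back of the window */
	if (w->bitmap & (UINT64_C(1) << age))
		return false;		/* already seen: replay */
	w->bitmap |= UINT64_C(1) << age;
	return true;
}
```

In a real receive path the check would run only after the packet has been authenticated, so that a forged counter cannot advance the window.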
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.71 |
|
04-Nov-2022 |
ozaki-r |
inpcb: rename functions to inpcb_*
Inspired by rmind-smpnet patches.
|
#
1.70 |
|
28-Oct-2022 |
ozaki-r |
Adjust pf, wg, dccp and sctp for struct inpcb integration
|
Revision tags: bouyer-sunxi-drm-base
|
#
1.69 |
|
25-Mar-2022 |
hannken |
Prevent memory corruption from wg_send_handshake_msg_init() on LP64 machines with "MSIZE == 256", sparc64 for example.
wg_send_handshake_msg_init() tries to put 148 bytes into a buffer of 144 bytes and overwrites 4 bytes following the mbuf. Check for "sizeof() > MHLEN" and use a cluster in this case.
With help from Taylor R Campbell <riastradh@>
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signnes, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB-byte stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at comple-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop- gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are. - Improves parallelism -- softint was kernel-locked to serialize access to the pcq. - Requires per-peer queue on handshake init to avoid dropping first packet. . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
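The clamping idea can be sketched in userland C as follows. This is only an illustration under stated assumptions: `sketch_secs_to_ticks` and its `hz` parameter are hypothetical names, and the commit does not say how wg(4) actually performs the clamp.

```c
#include <assert.h>
#include <limits.h>

/*
 * Convert a duration in seconds to clock ticks, saturating at INT_MAX
 * so the result is always safe to hand to an API that takes int ticks.
 * Hypothetical helper; not the wg(4) code.
 */
static int
sketch_secs_to_ticks(unsigned secs, unsigned hz)
{
	assert(hz > 0 && hz <= (unsigned)INT_MAX);

	/* Compare before multiplying so the product cannot overflow. */
	if (secs > (unsigned)INT_MAX / hz)
		return INT_MAX;
	return (int)(secs * hz);
}
```

Checking against `INT_MAX / hz` before multiplying avoids ever computing an out-of-range product.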
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
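A minimal userland sketch of the idiom, for illustration only: the macro is a simplified stand-in for the kernel's container_of, and `struct task`, `struct peer`, and `peer_from_task` are hypothetical names, not wg(4) types.

```c
#include <stddef.h>

/*
 * Simplified stand-in for the kernel macro: recover a pointer to the
 * enclosing structure from a pointer to one of its members.
 */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct task {			/* hypothetical embedded member */
	int pending;
};

struct peer {			/* hypothetical enclosing structure */
	int id;
	struct task work;	/* embedded, deliberately not at offset 0 */
};

/*
 * A callback that receives only the embedded member can still reach
 * the enclosing peer.  Unlike a cast through void *, this names the
 * member explicitly and keeps working if the struct layout changes.
 */
static struct peer *
peer_from_task(struct task *t)
{
	return container_of(t, struct peer, work);
}
```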
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
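For illustration, a portable userland equivalent of a big-endian 32-bit store might look like the sketch below. `sketch_be32enc` is a hypothetical name chosen to avoid clashing with the real be32enc, and this is not the wg(4) code.

```c
#include <stdint.h>

/*
 * Store a 32-bit value in big-endian (network) byte order one byte at
 * a time.  Because only byte stores are used, the destination may be
 * at any alignment, unlike a store through a uint32_t * cast.
 */
static void
sketch_be32enc(void *buf, uint32_t v)
{
	uint8_t *p = buf;

	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}
```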
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
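The reason is that memcmp may return as soon as it finds a differing byte, so its running time can leak how much of a secret matched. A minimal userland sketch of a constant-time comparison follows; `sketch_consttime_equal` is a hypothetical name, and a production version needs extra care that the compiler does not reintroduce an early exit.

```c
#include <stddef.h>

/*
 * Return 1 if the two buffers are equal and 0 otherwise.  Every byte
 * is examined regardless of where the first difference occurs, so the
 * loop's running time depends only on len.
 */
static int
sketch_consttime_equal(const void *a, const void *b, size_t len)
{
	const unsigned char *pa = a, *pb = b;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < len; i++)
		diff |= pa[i] ^ pb[i];	/* accumulate any difference */
	return diff == 0;
}
```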
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
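As a rough illustration of the general technique, the sketch below implements a 64-entry bitmap window in userland C. The window size, `struct replay_window`, and `replay_check_and_update` are assumptions made for the example; the commit does not describe the data structure wg(4) uses.

```c
#include <stdint.h>

#define WINDOW_BITS 64

struct replay_window {
	uint64_t highest;	/* highest counter accepted so far */
	uint64_t bitmap;	/* bit i set => counter (highest - i) seen */
	int started;		/* 0 until the first packet is accepted */
};

/*
 * Return 1 and record the counter if it has not been seen and is not
 * older than the window; return 0 for a replayed or too-old counter.
 */
static int
replay_check_and_update(struct replay_window *w, uint64_t ctr)
{
	uint64_t age;

	if (!w->started) {
		w->started = 1;
		w->highest = ctr;
		w->bitmap = 1;
		return 1;
	}
	if (ctr > w->highest) {
		/* Slide the window forward; old bits fall off the top. */
		uint64_t shift = ctr - w->highest;

		w->bitmap = (shift >= WINDOW_BITS) ? 0 : (w->bitmap << shift);
		w->bitmap |= 1;
		w->highest = ctr;
		return 1;
	}
	age = w->highest - ctr;
	if (age >= WINDOW_BITS)
		return 0;			/* older than the window */
	if (w->bitmap & (UINT64_C(1) << age))
		return 0;			/* already seen: replay */
	w->bitmap |= UINT64_C(1) << age;
	return 1;
}
```

The window lets packets arrive out of order by up to `WINDOW_BITS - 1` counters while still rejecting exact duplicates.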
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.70 |
|
28-Oct-2022 |
ozaki-r |
Adjust pf, wg, dccp and sctp for struct inpcb integration
|
Revision tags: bouyer-sunxi-drm-base
|
#
1.69 |
|
25-Mar-2022 |
hannken |
Prevent memory corruption from wg_send_handshake_msg_init() on LP64 machines with "MSIZE == 256", sparc64 for example.
wg_send_handshake_msg_init() tries to put 148 bytes into a buffer of 144 bytes and overwrites 4 bytes following the mbuf. Check for "sizeof() > MHLEN" and use a cluster in this case.
With help from Taylor R Campbell <riastradh@>
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
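The general pattern of bounding a caller-supplied length before doing arithmetic or allocation with it can be sketched as below. Everything named here is hypothetical: `SKETCH_MAX_REQ_LEN`, `sketch_alloc_request`, and the choice of E2BIG are assumptions for illustration, and the actual limit and error code used by wg(4) are not stated in the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical cap; the real limit used by wg(4) is not given here. */
#define SKETCH_MAX_REQ_LEN ((size_t)1 << 20)

/*
 * Validate a caller-supplied length before using it in size
 * arithmetic or allocation.  Returns 0 and sets *bufp on success, or
 * an errno value on failure.
 */
static int
sketch_alloc_request(size_t user_len, void **bufp)
{
	*bufp = NULL;
	if (user_len > SKETCH_MAX_REQ_LEN)
		return E2BIG;		/* reject before any size math */

	/* user_len + 1 cannot overflow: user_len <= SKETCH_MAX_REQ_LEN. */
	*bufp = calloc(1, user_len + 1);
	if (*bufp == NULL)
		return ENOMEM;
	return 0;
}
```

Rejecting oversized requests first means the later `user_len + 1` cannot wrap around, and a single request cannot ask for an unbounded amount of memory.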
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signedness, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
. Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.69 |
|
25-Mar-2022 |
hannken |
Prevent memory corruption from wg_send_handshake_msg_init() on LP64 machines with "MSIZE == 256", sparc64 for example.
wg_send_handshake_msg_init() tries to put 148 bytes into a buffer of 144 bytes and overwrites 4 bytes following the mbuf. Check for "sizeof() > MHLEN" and use a cluster in this case.
With help from Taylor R Campbell <riastradh@>
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signedness, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
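The clamping step can be illustrated with a small userland sketch. This is not the code from the commit: the helper name `secs_to_ticks_clamped` and the choice of INT_MAX as the ceiling are assumptions made for illustration only.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (not from if_wg.c): convert a timeout in seconds
 * to a tick count that is safe to store in an int, saturating at
 * INT_MAX instead of overflowing.
 */
static int
secs_to_ticks_clamped(unsigned secs, unsigned hz)
{
	assert(hz > 0);

	/* secs * hz would exceed INT_MAX, so saturate. */
	if (secs > (unsigned)INT_MAX / hz)
		return INT_MAX;
	return (int)(secs * hz);
}
```

The comparison is done by division before multiplying, so the overflow check itself cannot overflow.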
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
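As a rough illustration of why this matters: a cast via void * only recovers the enclosing structure when the embedded member happens to be its first field, whereas a container_of-style macro subtracts the member's offset explicitly. The sketch below is userland C; the macro is spelled `CONTAINER_OF` and the `struct peer` / `struct work` types are hypothetical stand-ins, not the structures in if_wg.c.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative equivalent of a container_of macro. */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical types: the embedded member is deliberately not first. */
struct work {
	int pending;
};

struct peer {
	int id;
	struct work task;
};

/* Recover the enclosing peer from a pointer to its embedded task. */
static struct peer *
peer_from_work(struct work *wk)
{
	assert(wk != NULL);
	return CONTAINER_OF(wk, struct peer, task);
}
```

Because `task` is not the first member here, `(struct peer *)(void *)wk` would point into the middle of the structure; the offset subtraction stays correct if fields are reordered.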
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
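The difference can be sketched in portable C. NetBSD provides be32enc in <sys/endian.h>; the `store_be32` function below is a hypothetical stand-in that does the same job byte by byte, so it never performs a uint32_t store through a possibly misaligned pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for be32enc: write v in big-endian (network)
 * order one byte at a time, so buf may have any alignment.
 */
static void
store_be32(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	assert(buf != NULL);
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```

By contrast, `*(uint32_t *)buf = htonl(v)` assumes buf is suitably aligned for uint32_t, which packet data at an arbitrary offset does not guarantee.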
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
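For context, memcmp may return as soon as it finds a differing byte, so its running time can leak how long a matching prefix of the secret was. A minimal userland sketch of the constant-time idea follows; `ct_memequal` is a hypothetical name, and the kernel's own consttime_memequal should be used in real code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical illustration of a constant-time equality test.
 * Every byte is always examined, and differences are accumulated
 * with OR rather than by branching on the data.
 * Returns 1 if the buffers are equal, 0 otherwise.
 */
static int
ct_memequal(const void *a, const void *b, size_t len)
{
	const volatile unsigned char *pa = a;
	const volatile unsigned char *pb = b;
	unsigned int diff = 0;
	size_t i;

	assert(a != NULL && b != NULL);
	for (i = 0; i < len; i++)
		diff |= (unsigned int)(pa[i] ^ pb[i]);

	/* diff is in 0..255; (diff - 1) >> 8 has its low bit set only for 0. */
	return (int)(1 & ((diff - 1) >> 8));
}
```

Note the return convention is the opposite of memcmp: nonzero means equal.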
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
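The general technique can be sketched as follows. This is a generic bitmap-based replay window, not the code from if_wg.c: the 64-entry window size, the type and function names, and the single-call check-and-update interface are all assumptions for illustration. A real implementation would typically authenticate the packet before recording its counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REPLAY_WINDOW_BITS 64u	/* hypothetical window size */

struct replay_window {
	uint64_t highest;	/* highest sequence number accepted so far */
	uint64_t bitmap;	/* bit i set => (highest - i) already seen */
	bool any;		/* false until the first packet is accepted */
};

/*
 * Returns true and records seq if it has not been seen and is still
 * inside the window; returns false for replays and too-old packets.
 */
static bool
replay_check_and_update(struct replay_window *w, uint64_t seq)
{
	uint64_t shift, offset, bit;

	assert(w != NULL);

	if (!w->any) {
		w->any = true;
		w->highest = seq;
		w->bitmap = 1;
		return true;
	}
	if (seq > w->highest) {
		/* Slide the window forward; bit 0 now stands for seq. */
		shift = seq - w->highest;
		w->bitmap = (shift >= REPLAY_WINDOW_BITS) ?
		    0 : (w->bitmap << shift);
		w->bitmap |= 1;
		w->highest = seq;
		return true;
	}
	offset = w->highest - seq;
	if (offset >= REPLAY_WINDOW_BITS)
		return false;		/* fell off the back of the window */
	bit = UINT64_C(1) << offset;
	if (w->bitmap & bit)
		return false;		/* already seen: replay */
	w->bitmap |= bit;
	return true;
}
```

The window tolerates reordering of up to REPLAY_WINDOW_BITS - 1 packets behind the newest one while still rejecting any counter value that has been accepted before.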
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
  wg_worker (kthread)
  wg_receive_packets
  wg_handle_packet
  wg_handle_msg_data
  KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.68 |
|
16-Jan-2022 |
riastradh |
wg(4): Limit the size of ifdrv requests.
Avoids potential integer overflow or kernel memory exhaustion.
Reported by Thomas Leroy a while back.
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signedness, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.67 |
|
31-Dec-2021 |
riastradh |
sys: Use if_init wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.66 |
|
31-Dec-2021 |
riastradh |
sys: Use if_stop wrapper function.
Exception: Not in kern_pmf.c, for the kind of silly reason that it avoids having kern_pmf.c refer to symbols defined only in net; this avoids a pain in the rump.
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signnes, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB-byte stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at comple-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop- gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
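As a rough illustration of the idiom named here (not the code from this commit), a container_of-style macro recovers a pointer to the enclosing structure from a pointer to one of its members, with the member named explicitly instead of relying on a cast through void *. The macro name CONTAINER_OF and the structures work_item and peer_example are hypothetical stand-ins, not wg(4) definitions.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a container_of macro: subtract the member's
 * offset from the member pointer to get back to the enclosing object.
 */
#define CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Illustrative types only; not the wg(4) structures. */
struct work_item {
	int wi_pending;
};

struct peer_example {
	int pe_id;
	struct work_item pe_work;	/* embedded, not at offset 0 */
};

/* Map an embedded work item back to the peer that contains it. */
static struct peer_example *
work_to_peer(struct work_item *wi)
{
	return CONTAINER_OF(wi, struct peer_example, pe_work);
}
```

Unlike a cast via void *, this keeps working if the embedded member is moved away from the start of the structure.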
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
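A minimal sketch of the byte-wise approach, assuming only standard C: storing a 32-bit value in big-endian order one byte at a time avoids dereferencing a possibly unaligned uint32_t pointer. The helper name put_be32 is hypothetical and only illustrates the technique; it is not the kernel's be32enc.

```c
#include <stdint.h>

/*
 * Store v at buf in big-endian (network) byte order.
 * Writing through unsigned char makes any alignment of buf acceptable
 * and gives the same result on little- and big-endian hosts.
 */
static void
put_be32(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	p[0] = (unsigned char)((v >> 24) & 0xff);
	p[1] = (unsigned char)((v >> 16) & 0xff);
	p[2] = (unsigned char)((v >> 8) & 0xff);
	p[3] = (unsigned char)(v & 0xff);
}
```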
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
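The reason for this change can be sketched as follows: memcmp may return at the first differing byte, so its running time can leak how much of a secret matched, whereas a constant-time comparison touches every byte regardless. The function ct_equal below is a hypothetical user-space illustration of that idea, not the kernel's consttime_memequal, and a production version needs more care to keep the compiler from reintroducing data-dependent branches.

```c
#include <stddef.h>

/*
 * Compare n bytes without an early exit.
 * Returns 1 if the buffers are equal, 0 otherwise.
 */
static int
ct_equal(const void *a, const void *b, size_t n)
{
	const volatile unsigned char *pa = a;
	const volatile unsigned char *pb = b;
	unsigned char diff = 0;
	size_t i;

	/* Accumulate all differences; the loop length depends only on n. */
	for (i = 0; i < n; i++)
		diff |= (unsigned char)(pa[i] ^ pb[i]);

	return diff == 0;
}
```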
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
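For readers unfamiliar with the technique, a sliding-window replay check can be sketched as below. This is an assumption-laden illustration, not the wg(4) implementation: it uses a single 64-bit bitmap (the real window is presumably larger), and the names replay_window and replay_check_and_update are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_BITS 64	/* assumed window size for this sketch */

struct replay_window {
	uint64_t rw_last;	/* highest sequence number accepted so far */
	uint64_t rw_bitmap;	/* bit i set => (rw_last - i) already seen */
	bool rw_any;		/* false until the first packet is accepted */
};

/* Return true and record seq if it is fresh; false if replayed or too old. */
static bool
replay_check_and_update(struct replay_window *rw, uint64_t seq)
{
	uint64_t age, bit;

	if (!rw->rw_any) {
		rw->rw_any = true;
		rw->rw_last = seq;
		rw->rw_bitmap = 1;
		return true;
	}

	if (seq > rw->rw_last) {
		/* Newer than anything seen: slide the window forward. */
		uint64_t shift = seq - rw->rw_last;

		rw->rw_bitmap = (shift >= WINDOW_BITS) ?
		    0 : (rw->rw_bitmap << shift);
		rw->rw_bitmap |= 1;
		rw->rw_last = seq;
		return true;
	}

	age = rw->rw_last - seq;
	if (age >= WINDOW_BITS)
		return false;		/* fell off the back of the window */

	bit = UINT64_C(1) << age;
	if (rw->rw_bitmap & bit)
		return false;		/* already seen: replay */

	rw->rw_bitmap |= bit;
	return true;
}
```

Out-of-order packets inside the window are accepted once; anything older than the window or already marked is rejected.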
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.65 |
|
17-Aug-2021 |
christos |
Some signedness, casts, and constant sizes. Add module dependencies.
|
Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base thorpej-i2c-spi-conf-base
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stopgap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.64 |
|
16-Jun-2021 |
riastradh |
if_attach and if_initialize cannot fail, don't test return value
These were originally made failable back in 2017 when if_initialize allocated a softint in every interface for link state changes, so that it could fail gracefully instead of panicking:
https://mail-index.NetBSD.org/source-changes/2017/10/23/msg089053.html
However, this spawned many seldom- or never-tested error branches, which are risky to have around. And that softint in every interface has since been replaced by a single global workqueue, because link state changes require thread context but not low latency or high throughput:
https://mail-index.NetBSD.org/source-changes/2020/02/06/msg113759.html
So there is no longer any reason for if_initialize to fail. (The subroutine if_stats_init can't fail because percpu_alloc can't fail either.)
There is a snag: the softint_establish in if_percpuq_create could fail, potentially leading to bad consequences later on trying to use the softint. This change doesn't introduce any new bugs because of the snag -- if_percpuq_attach was already broken. However, the snag can be better addressed without spawning error branches, either by using a single softint or making softints less scarce.
(Separate commit will change the signatures of if_attach and if_initialize to return void, scheduled to ride whatever is the next convenient kernel bump.)
Patch and testing on amd64 and evbmips64-eb by maya@; commit message soliloquy, and compile-testing on evbppc/i386/earmv7hf, by me.
|
Revision tags: cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
|
Revision tags: thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
branches: 1.62.4; wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2; wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stopgap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
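The last item can be illustrated outside the kernel. The following is a minimal sketch using C11 <stdatomic.h> as a stand-in for the kernel's own atomic primitives; the names sketch_session, sketch_stable, sketch_publish and sketch_read_key are hypothetical and do not appear in if_wg.c. It shows a writer publishing a fully initialized object with a plain store-release, which needs no read/modify/write operation when nothing depends on the previous value.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a session object; not the driver's struct. */
struct sketch_session {
	int	s_key;
};

/* Pointer to the currently published session, NULL until one exists. */
static _Atomic(struct sketch_session *) sketch_stable;

/*
 * Writer: finish initializing the object, then publish it.  The
 * release store orders the initialization before the pointer becomes
 * visible; no atomic swap is needed because the old value is unused.
 */
static void
sketch_publish(struct sketch_session *s, int key)
{
	s->s_key = key;
	atomic_store_explicit(&sketch_stable, s, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above. */
static int
sketch_read_key(void)
{
	struct sketch_session *s =
	    atomic_load_explicit(&sketch_stable, memory_order_acquire);

	return s == NULL ? -1 : s->s_key;
}
```

A swap would still be the right tool wherever the caller needs the old pointer back, for example to free it.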
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
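The point about not choosing the same session index twice can be sketched as a retry loop. This is an illustration under stated assumptions, not the driver's code: a small array stands in for thmap(9), a toy linear congruential generator stands in for the kernel's random number source, and every sketch_* name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSLOTS	8	/* toy capacity, assumption for the sketch */

struct sketch_table {
	uint32_t	t_index[SKETCH_NSLOTS];
	size_t		t_count;
};

/* Toy generator (assumption); a real driver would use a strong RNG. */
static uint32_t
sketch_next(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return *state;
}

/* Linear scan standing in for a map lookup. */
static bool
sketch_in_use(const struct sketch_table *t, uint32_t idx)
{
	size_t i;

	for (i = 0; i < t->t_count; i++)
		if (t->t_index[i] == idx)
			return true;
	return false;
}

/* Draw candidates until one is not already in the table, then record it. */
static bool
sketch_alloc_index(struct sketch_table *t, uint32_t *state, uint32_t *out)
{
	uint32_t idx;

	if (t->t_count == SKETCH_NSLOTS)
		return false;		/* table full */
	do {
		idx = sketch_next(state);
	} while (sketch_in_use(t, idx));
	t->t_index[t->t_count++] = idx;
	*out = idx;
	return true;
}
```

The lookup and the insertion have to happen under the same serialization, otherwise two allocators could both see a candidate as free.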
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
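The clamping step can be sketched as follows. This is a minimal illustration, assuming a positive tick rate passed in as hz; sketch_secs_to_ticks is a hypothetical helper, not a function from the driver.

```c
#include <limits.h>

/*
 * Convert a duration in seconds to ticks, clamping the input first so
 * that secs * hz always fits in an int.  hz is assumed to be positive.
 */
static int
sketch_secs_to_ticks(unsigned secs, int hz)
{
	unsigned max_secs = (unsigned)INT_MAX / (unsigned)hz;

	if (secs > max_secs)
		secs = max_secs;	/* clamp instead of overflowing */
	return (int)(secs * (unsigned)hz);
}
```

Clamping before the multiplication, rather than after, is what keeps the intermediate product in range.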
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
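For reference, the "careful detection of wraparound" mentioned above is commonly done with serial-number arithmetic. The sketch below shows only that comparison; it is not something this commit implements, and sketch_seq_before is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a comes before b modulo 2^32, assuming the two values are
 * less than 2^31 apart.  The unsigned subtraction wraps, and the
 * conversion to int32_t assumes two's complement, as on all common
 * platforms.
 */
static bool
sketch_seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}
```

The comparison is only meaningful while the two counters stay within half the number space of each other, which is the part that takes effort to guarantee.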
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
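A minimal sketch of the idiom, assuming a macro built on offsetof; the sketch_* names and struct layout are hypothetical and chosen so that the embedded member is deliberately not first, which is the case where a cast via void * would silently give the wrong address.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_work {
	int	w_pending;
};

struct sketch_peer {
	int			p_id;
	struct sketch_work	p_work;	/* embedded, not the first member */
};

/* Map a work item back to the peer that embeds it. */
static struct sketch_peer *
sketch_work_to_peer(struct sketch_work *wk)
{
	return sketch_container_of(wk, struct sketch_peer, p_work);
}
```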
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
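What a byte-wise big-endian store looks like, as a rough sketch: sketch_be32enc is a hypothetical equivalent written for illustration, not the system's be32enc. Because it writes one byte at a time, it works at any alignment and on any host byte order.

```c
#include <stdint.h>

/* Store v at buf in big-endian order, one byte at a time. */
static void
sketch_be32enc(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	p[0] = (unsigned char)(v >> 24);	/* most significant byte first */
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```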
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
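A rough illustration of why this matters: memcmp may return at the first differing byte, so its running time can reveal how much of a secret matched. The sketch below is a hypothetical stand-in, sketch_ct_memequal, not the system's consttime_memequal; it accumulates the XOR of every byte pair so the loop always runs to the end.

```c
#include <stddef.h>

/*
 * Illustration only.  Returns 1 if the buffers are equal and 0
 * otherwise, reading every byte in both cases.
 */
static int
sketch_ct_memequal(const void *a, const void *b, size_t len)
{
	const volatile unsigned char *p = a;
	const volatile unsigned char *q = b;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < len; i++)
		diff |= p[i] ^ q[i];	/* accumulate, no early exit */

	return diff == 0;
}
```

Note that the return convention is the opposite of memcmp: nonzero means equal.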
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
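The general technique can be sketched with a single 64-bit bitmap. This is a simplified illustration, not the code added in this revision: the window width, the struct layout and the sketch_* names are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WINDOW	64	/* window width in packets (assumption) */

struct sketch_replay {
	uint64_t	r_max;		/* highest sequence number accepted */
	uint64_t	r_bitmap;	/* bit i set: r_max - i was seen */
	bool		r_any;		/* false until the first packet */
};

/* Return true and record seq if it is fresh; false if replayed or too old. */
static bool
sketch_replay_check(struct sketch_replay *r, uint64_t seq)
{
	uint64_t shift, off;

	if (!r->r_any) {
		r->r_any = true;
		r->r_max = seq;
		r->r_bitmap = 1;
		return true;
	}
	if (seq > r->r_max) {
		/* Window slides forward; old bits fall off the top. */
		shift = seq - r->r_max;
		r->r_bitmap = (shift >= SKETCH_WINDOW) ? 0 :
		    r->r_bitmap << shift;
		r->r_bitmap |= 1;
		r->r_max = seq;
		return true;
	}
	off = r->r_max - seq;
	if (off >= SKETCH_WINDOW)
		return false;		/* fell off the back of the window */
	if (r->r_bitmap & (UINT64_C(1) << off))
		return false;		/* already seen */
	r->r_bitmap |= UINT64_C(1) << off;
	return true;
}
```

A receiver would run this check only after the packet has been authenticated, so that forged sequence numbers cannot advance the window.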
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.63 |
|
29-Apr-2021 |
riastradh |
Sprinkle __noinline to reduce gigantic stack frames in ALL kernels.
In principle this might just push a real problem around, but this is unlikely to be a real problem because:
1. The large stack frames are really only in the setup state machine message handlers, which run at the top loop of a thread with a shallow stack anyway.
2. If these are inlined, gcc might create multiple nonoverlapping stack buffers, whereas if not inlined, the stack frames from consecutive or alternative procedure calls would overlap anyway.
(I haven't investigated exactly what's going on leading to ~5 KB stack frames, but this shuts gcc up, at least, and the hypotheses sound plausible to me!)
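A small sketch of the pattern described in point 2, assuming a GCC- or Clang-style noinline attribute; SKETCH_NOINLINE and the handler names are hypothetical. Each handler owns a large scratch buffer, and keeping the handlers out of line means the dispatcher's own frame does not have to hold any of them.

```c
#include <stddef.h>
#include <string.h>

/* Assumes a GCC- or Clang-compatible compiler. */
#define SKETCH_NOINLINE	__attribute__((__noinline__))

/* Handler with a large local buffer: sums the message bytes. */
static SKETCH_NOINLINE unsigned
sketch_handle_init(const unsigned char *msg, size_t len)
{
	unsigned char scratch[1024];
	unsigned sum = 0;
	size_t i;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, msg, len);
	for (i = 0; i < len; i++)
		sum += scratch[i];
	return sum;
}

/* Second handler with its own large buffer: XORs the message bytes. */
static SKETCH_NOINLINE unsigned
sketch_handle_resp(const unsigned char *msg, size_t len)
{
	unsigned char scratch[1024];
	unsigned x = 0;
	size_t i;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, msg, len);
	for (i = 0; i < len; i++)
		x ^= scratch[i];
	return x;
}

/* Only one handler frame is live at a time, so their buffers can overlap. */
static unsigned
sketch_dispatch(int type, const unsigned char *msg, size_t len)
{
	switch (type) {
	case 1:
		return sketch_handle_init(msg, len);
	case 2:
		return sketch_handle_resp(msg, len);
	default:
		return 0;
	}
}
```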
|
Revision tags: thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
#
1.61 |
|
15-Oct-2020 |
roy |
branches: 1.61.2;
wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.62 |
|
11-Nov-2020 |
riastradh |
wg: Sprinkle #ifdef INET6. Avoid unconditional use of ip6 structs.
Fixes no-INET6 build.
Based on patch from Brad Spencer:
https://mail-index.NetBSD.org/current-users/2020/11/11/msg039883.html
|
Revision tags: thorpej-futex-base
|
#
1.61 |
|
15-Oct-2020 |
roy |
wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
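Illustration only, not taken from the commit: casting a byte pointer to uint32_t * and storing htonl(v) through it assumes the destination is 4-byte aligned, which a packet buffer does not guarantee. A byte-wise store avoids that assumption. The helper name put_be32 below is hypothetical; it only mimics what be32enc from sys/endian.h does.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for be32enc(): store v at buf in big-endian
 * (network) byte order, one byte at a time, so buf may have any
 * alignment.
 */
static void
put_be32(void *buf, uint32_t v)
{
	uint8_t *p = buf;

	p[0] = (uint8_t)(v >> 24);	/* most significant byte first */
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;		/* least significant byte last */
}
```

Writing to an odd offset such as buf + 1 is fine with this form, whereas the cast-and-store form may fault on strict-alignment machines.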
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
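Illustration only, not taken from the commit: memcmp may return at the first differing byte, so its running time can reveal how long a matching prefix an attacker has guessed. A constant-time comparison touches every byte regardless. The function ct_equal below is a hypothetical sketch of the idea, not the kernel's consttime_memequal(9) implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return 1 if the n bytes at a and b are equal,
 * 0 otherwise.  The loop always runs n iterations and accumulates
 * differences with OR, so there is no early exit that depends on
 * the secret data.
 */
static int
ct_equal(const void *a, const void *b, size_t n)
{
	const volatile uint8_t *pa = a;
	const volatile uint8_t *pb = b;
	unsigned diff = 0;
	size_t i;

	for (i = 0; i < n; i++)
		diff |= (unsigned)(pa[i] ^ pb[i]);

	/* diff is in [0, 255]; (diff - 1) >> 8 has its low bit set only if diff == 0. */
	return (int)(1 & ((diff - 1) >> 8));
}
```

As with consttime_memequal(9), the result is an equality flag, not the three-way ordering that memcmp returns.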
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
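Illustration only, not the code from this commit: a sliding-window replay check remembers the highest counter accepted so far plus a bitmap of recently seen counters below it. A packet is rejected if its counter is older than the window or already marked. The struct and function names below are hypothetical, and the 64-entry window is an assumption chosen to fit one uint64_t; it says nothing about the window size wg(4) uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define REPLAY_WINDOW_BITS 64	/* assumed window size for this sketch */

struct replay_window {
	uint64_t highest;	/* highest counter accepted so far */
	uint64_t bitmap;	/* bit i set => counter (highest - i) was seen */
};

/*
 * Return true and record ctr if it is new; return false if it is a
 * replay or too old.  Call only after the packet has been
 * authenticated, so forged counters cannot advance the window.
 */
static bool
replay_check_and_update(struct replay_window *w, uint64_t ctr)
{
	uint64_t off, bit;

	if (ctr > w->highest) {
		uint64_t shift = ctr - w->highest;

		/* Slide the window forward; everything falls out if the jump is large. */
		w->bitmap = (shift >= REPLAY_WINDOW_BITS) ? 0 : (w->bitmap << shift);
		w->bitmap |= 1;		/* bit 0 now stands for ctr itself */
		w->highest = ctr;
		return true;
	}

	off = w->highest - ctr;
	if (off >= REPLAY_WINDOW_BITS)
		return false;		/* older than the window */

	bit = (uint64_t)1 << off;
	if (w->bitmap & bit)
		return false;		/* already seen */

	w->bitmap |= bit;
	return true;
}
```

A zero-initialized struct replay_window is a valid starting state: counter 0 is accepted once and rejected thereafter.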
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
  wg_receive_packets
    wg_handle_packet
      wg_handle_msg_data
        KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.61 |
|
15-Oct-2020 |
roy |
wg: with no peers, the link status is DOWN, otherwise UP
This mirrors the recent changes to gif(4) where the link is UP when a tunnel is set, otherwise DOWN.
|
#
1.60 |
|
14-Sep-2020 |
riastradh |
wg: Add altq hooks.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock, so it would serialize all transmission to all peers on a single wg(4) interface).
altq can be disabled at compile-time or at run-time; even if included at compile-time the run-time impact should be negligible if disabled.
|
#
1.59 |
|
13-Sep-2020 |
riastradh |
wg: Fix detach logic.
Not tested but this should be less of a rake to step on if anyone made an unloadable wg module.
|
#
1.58 |
|
13-Sep-2020 |
riastradh |
wg: Use RUN_ONCE to defer workqueue_create until after configure.
Should really fix workqueue(9) so workqueue_create can be done before CPUs have been detected in configure, but this will serve as a stop-gap measure.
|
#
1.57 |
|
13-Sep-2020 |
riastradh |
wg: Add missing kpreempt_disable/enable around pktq_enqueue.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.56 |
|
08-Sep-2020 |
riastradh |
wg: Drop wgp_lock while waiting for endpoint psref to drain.
- This is safe because wgp_endpoint_changing locks out any attempts to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear that it's safe to just release wgp_lock there; may need to create a new session state, say WGS_STATE_DRAINING, while we wait for psref_target_destroy. But this needs a little more thought; a new state may not be necessary, and would be nice to avoid if not necessary.
|
#
1.55 |
|
07-Sep-2020 |
riastradh |
wg: Use threadpool(9) and workqueue(9) for asynchronous tasks.
- Using threadpool(9) job per interface to receive incoming handshake messages gives the same concurrency for active interfaces but doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's no bound on the amount of time wg_receive_packets() might run for; we really need separate threads or threadpool jobs in order to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK to share a global workqueue -- there's no advantage to adding concurrency for what is almost certainly going to be CPU-bound asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a list of all peers, so the task mechanism should no longer be a bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on the same CPU where the packet came in. Should consider doing something to balance the load -- maybe note if the current CPU is loaded, and if so, sort CPUs by queue length or some other measure of load and pick the least loaded one or something.
|
#
1.54 |
|
07-Sep-2020 |
riastradh |
wg: Use a global pktqueue rather than a per-peer pcq.
- Improves scalability -- won't hit limit on softints no matter how many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first packet.
  . Per-peer queue is currently a single packet -- should serve well enough for pings, dns queries, tcp connections, &c.
|
#
1.53 |
|
07-Sep-2020 |
riastradh |
wg: Fix debug output now that the priority is mixed into it.
|
#
1.52 |
|
07-Sep-2020 |
riastradh |
wg: Fix non-DIAGNOSTIC build.
|
#
1.51 |
|
31-Aug-2020 |
riastradh |
wg: Avoid memory leak if socreate fails.
|
#
1.50 |
|
31-Aug-2020 |
riastradh |
wg: Make it build with WG_DEBUG on 32-bit platforms.
|
#
1.49 |
|
31-Aug-2020 |
riastradh |
wg: Simplify locking.
Summary: Access to a stable established session is still allowed via psref; all other access to peer and session state is now serialized by struct wg_peer::wgp_lock, with no dancing around a per-session lock. This way, the handshake paths are locked, while the data transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session is allowed only with struct wg_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be store-release.
|
#
1.48 |
|
31-Aug-2020 |
riastradh |
wg: M_NOWAIT -> M_DONTWAIT
These happen to be aliases, but M_NOWAIT is part of the legacy malloc API whereas M_DONTWAIT is part of the mbuf API.
|
#
1.47 |
|
31-Aug-2020 |
riastradh |
wg: wg_sockaddr audit.
- Ensure all access to struct wg_peer::wgp_endpoint happens while holding a psref.
- Simplify internalize/externalize logic and be more careful about verifying it before printing anything.
|
#
1.46 |
|
31-Aug-2020 |
riastradh |
wg: On INIT, do DH and decrypt timestamp before locking session.
This narrows the window when the session is unlocked. Really there should be no such window, but we'll finish getting rid of it later.
|
#
1.45 |
|
31-Aug-2020 |
riastradh |
wg: Verify or send cookie challenge before looking up session.
This step doesn't depend on the session, so let's avoid touching the session state until we've passed it.
|
#
1.44 |
|
31-Aug-2020 |
riastradh |
wg: Verify mac1 as the first step on INIT and RESP messages.
This avoids the expensive DH computation before the sender has proven knowledge of our public key.
|
#
1.43 |
|
31-Aug-2020 |
riastradh |
wg: Omit needless variable.
|
#
1.42 |
|
31-Aug-2020 |
riastradh |
wg: Switch to callout_stop for session destructor timer.
Can't release the lock here, and can't sleep waiting for the callout while we hold it without risking deadlock. But not waiting is fine; after we transition out of WGS_STATE_UNKNOWN the timer has no effect.
|
#
1.41 |
|
31-Aug-2020 |
riastradh |
wg: Fix indentation. No functional change.
|
#
1.40 |
|
31-Aug-2020 |
riastradh |
wg: Just call callout_halt directly.
No functional change, just makes it easier to read where callout_halt happens.
|
#
1.39 |
|
31-Aug-2020 |
riastradh |
wg: Fix byte order on wire.
Give this a chance to work on big-endian systems.
|
#
1.38 |
|
31-Aug-2020 |
riastradh |
wg: mbuf m_freem audit.
1. wg_handle_msg_data frees m but the other wg_handle_msg_* just take a pointer to the mbuf content and not m itself, so free m in those cases.
2. Can't trivially prove that the pcq is empty by the time wg_destroy_peer runs pcq_destroy, so let's explicitly purge it just in case.
3. If wg_send_udp isn't doing udp_send or udp6_output, it still has to free m in the !INET6 error branch for IPv6 packets.
4. After rumpuser_wg_send_peer or rumpuser_wg_send_user, we still need to free the mbuf.
|
#
1.37 |
|
31-Aug-2020 |
riastradh |
wg: Use thmap(9) for peer and session lookup.
Make sure we also don't trip over our own shoelaces by choosing the same session index twice.
|
#
1.36 |
|
31-Aug-2020 |
riastradh |
wg: XAEAD doesn't use a counter, so don't pass one.
|
#
1.35 |
|
31-Aug-2020 |
riastradh |
wg: Count down wg_npeers in wg_destroy_all_peers too.
Doesn't actually make a difference -- wg_destroy_all_peers is only used when we're destroying the wg instance altogether -- but let's not leave rakes to step on.
|
#
1.34 |
|
31-Aug-2020 |
riastradh |
wg: Note lock order.
|
#
1.33 |
|
31-Aug-2020 |
riastradh |
wg: Remove IFF_POINTOPOINT.
Unclear why this was set; setting it seems to have required a kludge in netinet/in.c that broke ipsec tunnels. Clearing it makes wg work again after that kludge was reverted.
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
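The clamping step mentioned above can be sketched in portable C as follows. This is an illustration only, not the code from the commit: the function name secs_to_ticks_clamped and the hz parameter are assumptions made for the example.

```c
#include <limits.h>

/*
 * Hypothetical helper: convert a duration in seconds to clock ticks,
 * saturating at INT_MAX instead of overflowing when the product
 * secs * hz does not fit in an int.
 */
static int
secs_to_ticks_clamped(unsigned secs, unsigned hz)
{
	if (hz == 0)
		return 0;		/* no tick rate, nothing to schedule */
	if (secs > (unsigned)INT_MAX / hz)
		return INT_MAX;		/* product would exceed INT_MAX */
	return (int)(secs * hz);	/* safe: secs * hz <= INT_MAX */
}
```

Checking the bound by division before multiplying avoids relying on the overflowed product.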
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
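A minimal sketch of the container_of idiom, for readers unfamiliar with it. The macro name container_of_sketch and the struct names are hypothetical and chosen for this example; the kernel's own container_of macro may differ in detail, for instance by adding type checks.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel macro: recover a pointer to the
 * enclosing structure from a pointer to one of its members.
 */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct work_sketch {
	int pending;
};

struct peer_sketch {
	int id;
	struct work_sketch task;	/* embedded member handed to callbacks */
};

/* A callback that receives only the member can get back to its owner. */
static struct peer_sketch *
peer_from_task(struct work_sketch *wk)
{
	return container_of_sketch(wk, struct peer_sketch, task);
}
```

Unlike a cast through void *, this keeps working if the member is not the first field of the structure, and it names the relationship explicitly.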
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
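The idea behind this change can be sketched in portable C. The function put_be32 below is a hypothetical stand-in written for this example, not the system's be32enc; it stores a 32-bit value in big-endian order one byte at a time, so it makes no alignment assumption about the destination.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for be32enc: write v to buf in big-endian
 * byte order. Byte stores are valid at any address, whereas
 * *(uint32_t *)buf = htonl(v) requires buf to be suitably aligned.
 */
static void
put_be32(void *buf, uint32_t v)
{
	unsigned char *p = buf;

	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}
```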
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
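The reason for the change: memcmp may return as soon as it finds a differing byte, so its running time can reveal how long a matching prefix an attacker has guessed. A rough portable sketch of a comparison whose loop does not exit early is shown below. The name ct_memequal is hypothetical, and this is not the system's consttime_memequal; a production version also has to guard against compiler optimizations that this sketch does not address.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: return 1 if the n bytes at a and b are equal,
 * 0 otherwise. Every byte is examined regardless of where the first
 * difference occurs.
 */
static int
ct_memequal(const void *a, const void *b, size_t n)
{
	const volatile unsigned char *p = a;
	const volatile unsigned char *q = b;
	unsigned int diff = 0;

	for (size_t i = 0; i < n; i++)
		diff |= (unsigned int)(p[i] ^ q[i]);

	/*
	 * diff is in 0..255. If it is 0, diff - 1 wraps to all ones and
	 * bit 8 is set; otherwise diff - 1 is in 0..254 and bit 8 is clear.
	 */
	return (int)(1 & ((diff - 1) >> 8));
}
```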
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
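For illustration, a sliding-window replay check can be sketched as below. This is a generic bitmap scheme under assumed parameters, not the code from this commit: the 64-entry window, the struct layout, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_BITS 64	/* hypothetical window size */

struct replay_window {
	uint64_t top;		/* highest sequence number accepted so far */
	uint64_t bitmap;	/* bit i set => sequence (top - i) was seen */
	bool any;		/* false until the first packet is accepted */
};

/*
 * Return true and record seq if it is new; return false if it is a
 * replay or has fallen out of the window.
 */
static bool
replay_check_and_update(struct replay_window *w, uint64_t seq)
{
	if (!w->any) {
		w->any = true;
		w->top = seq;
		w->bitmap = 1;
		return true;
	}
	if (seq > w->top) {
		/* Newer than anything seen: slide the window forward. */
		uint64_t shift = seq - w->top;

		w->bitmap = (shift >= WINDOW_BITS) ? 0 : (w->bitmap << shift);
		w->bitmap |= 1;
		w->top = seq;
		return true;
	}

	uint64_t age = w->top - seq;

	if (age >= WINDOW_BITS)
		return false;	/* too old to track */
	if (w->bitmap & (UINT64_C(1) << age))
		return false;	/* already seen: replay */
	w->bitmap |= UINT64_C(1) << age;
	return true;
}
```

The window lets packets arrive out of order within a bounded distance of the newest one while still rejecting duplicates.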
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
    wg_worker (kthread)
    wg_receive_packets
    wg_handle_packet
    wg_handle_msg_data
    KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.32 |
|
28-Aug-2020 |
riastradh |
wg: Sort includes.
|
#
1.31 |
|
27-Aug-2020 |
tih |
Summary: let wg interfaces carry multicast traffic
Once a wg interface is up and running, it is useful to be able to run a routing protocol over it. Marking the interface multicast capable enables this. (One must also use the wgconfig --allowed-ips option to explicitly permit the group one needs, e.g. 224.0.0.5/32 for OSPF.)
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t - use unsigned rather than time_t -- these are all short durations - clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|
#
1.30 |
|
27-Aug-2020 |
riastradh |
wg: Assert MCLBYTES is enough for requested length in wg_get_mbuf.
|
#
1.29 |
|
27-Aug-2020 |
riastradh |
wg: Make sure all paths into wg_handle_msg_data guarantee enough m_len.
Earlier commit moved the m_pullup into wg_validate_msg_header, but wg_overudp_cb doesn't go through that.
|
#
1.28 |
|
27-Aug-2020 |
riastradh |
wg: Drop invalid message types on the floor faster.
Don't even let them reach the thread -- drop them in softint.
|
#
1.27 |
|
27-Aug-2020 |
riastradh |
wg: KASSERT m_len before mtod.
XXX We should really make mtod do this automagically, and use something else for mtod(m, void *).
|
#
1.26 |
|
27-Aug-2020 |
riastradh |
wg: Use m_pullup to make message header contiguous before processing.
|
#
1.25 |
|
27-Aug-2020 |
riastradh |
wg: Check mbuf chain length before m_copydata.
|
#
1.24 |
|
26-Aug-2020 |
riastradh |
Clarify wg(4)'s relation to WireGuard, pending further discussion.
Still planning to replace wgconfig(8) and wg-keygen(8) by one wg(8) tool compatible with wireguard-tools; update wg(4) for the minor changes from the 2018-06-30 spec to the 2020-06-01 spec; &c. This just clarifies the current state of affairs as it exists in the development tree for now.
Mark the man page EXPERIMENTAL for extra clarity.
|
#
1.23 |
|
23-Aug-2020 |
riastradh |
Initialize peers early on for error branch.
|
#
1.22 |
|
21-Aug-2020 |
riastradh |
Use lock rather than 64-bit atomics for platforms without the latter.
|
#
1.21 |
|
21-Aug-2020 |
riastradh |
Fix sysctl types.
- CTLTYPE_QUAD, not CTLTYPE_LONG, for uint64_t
- use unsigned rather than time_t -- these are all short durations
- clamp timeouts to be safe for conversion to int ticks in callout
Should fix 32-bit builds.
|
#
1.20 |
|
21-Aug-2020 |
riastradh |
Ifdef out fast path that relies on atomic 64-bit load/store.
(Really this sliding window business could probably be done with 32-bit sequence numbers and careful detection of wraparound, but that's a little more effort to work out -- let's just unbreak the builds for now.)
|
#
1.19 |
|
20-Aug-2020 |
riastradh |
Mark KASSERT-only variable as __diagused.
|
#
1.18 |
|
20-Aug-2020 |
riastradh |
Avoid callout_halt under lock.
- We could pass the lock in, except we hold another lock too.
- We could halt before taking the other lock, but it's not safe to sleep after getting the session pointer before taking its lock.
- We could halt before getting the session pointer, but then there's no point in doing it under the lock.
So just halt a little earlier instead.
|
#
1.17 |
|
20-Aug-2020 |
riastradh |
Sprinkle const.
|
#
1.16 |
|
20-Aug-2020 |
riastradh |
Use container_of rather than casts via void *.
|
#
1.15 |
|
20-Aug-2020 |
riastradh |
Use be32enc, rather than possibly unaligned uint32_t cast and htonl.
|
#
1.14 |
|
20-Aug-2020 |
riastradh |
KNF
|
#
1.13 |
|
20-Aug-2020 |
riastradh |
Use consttime_memequal, not memcmp, to compare secrets for equality.
|
#
1.12 |
|
20-Aug-2020 |
riastradh |
Take advantage of prop_dictionary_util(3).
|
#
1.11 |
|
20-Aug-2020 |
riastradh |
Split up wg_process_peer_tasks into bite-size functions.
|
#
1.10 |
|
20-Aug-2020 |
riastradh |
Fix race in wg_worker kthread destruction.
Also allow the thread to migrate between CPUs -- just not while we're in the middle of processing and holding onto things with psrefs.
|
#
1.9 |
|
20-Aug-2020 |
riastradh |
Update for proplib API changes.
|
#
1.8 |
|
20-Aug-2020 |
riastradh |
Use SYSCTL_SETUP for net.wireguard subtree.
|
#
1.7 |
|
20-Aug-2020 |
riastradh |
Fix in-kernel debug build.
|
#
1.6 |
|
20-Aug-2020 |
riastradh |
Implement sliding window for wireguard replay detection.
|
#
1.5 |
|
20-Aug-2020 |
riastradh |
Don't falsely assert cpu_softintr_p().
Will fail in the following stack trace:
wg_worker (kthread)
wg_receive_packets
wg_handle_packet
wg_handle_msg_data
KASSERT(cpu_softintr_p())
Instead, use kpreempt_disable/enable around softint_schedule.
XXX Not clear that softint is the right place to do this!
|
#
1.4 |
|
20-Aug-2020 |
riastradh |
Convert wg(4) to if_stat.
|
#
1.3 |
|
20-Aug-2020 |
riastradh |
Use cprng_strong, not cprng_fast, for ephemeral key.
|
#
1.2 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Fix bugs found by maxv's audits
|
#
1.1 |
|
20-Aug-2020 |
riastradh |
[ozaki-r] Add wg files
|