History log of /netbsd-current/sys/kern/kern_entropy.c
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# 1.66 04-Oct-2023 ad

Eliminate l->l_ncsw and l->l_nivcsw. From memory think they were added
before we had per-LWP struct rusage; the same is now tracked there.


# 1.65 05-Aug-2023 riastradh

cprng(9): Drop and retake percpu reference across entropy_extract.

entropy_extract may sleep on an adaptive lock, which invalidates
percpu(9) references.

Add a note in the comment over entropy_extract about this.

Discovered by stumbling upon this panic during a test run:

[ 1.0200050] panic: kernel diagnostic assertion "(cprng == percpu_getref(cprng_fast_percpu)) && (percpu_putref(cprng_fast_percpu), true)" failed: file "/home/riastradh/netbsd/current/src/sys/rump/librump/rumpkern/../../../crypto/cprng_fast/cprng_fast.c", line 117

XXX pullup-10


# 1.64 04-Aug-2023 riastradh

entropy(9): Disable !cold assertion in rump for now.

Evidently rump starts threads before it sets cold = 0, and deferring
the call to rump_thread_allow(NULL) in rump.c rump_init_callback
until after setting cold = 0 doesn't work either -- rump kernels just
hang. To be investigated -- for now, let's just stop breaking
thousands of tests.


# 1.63 04-Aug-2023 riastradh

entropy(9): Simplify stages. Split interrupt vs non-interrupt paths.

- Nix the entropy stage (cold, warm, hot). Just use the usual kernel
`cold' (cold: single-core, single-thread; interrupts may happen),
and don't make any three-way distinction about whether interrupts
or threads or other CPUs can be running.

Instead, while cold, use splhigh/splx or forbid paths to come from
interrupt context, and while warm, use mutex or the per-CPU hard
and soft interrupt paths for low latency. This comes at a small
cost to some interrupt latency, since we may stir the pool in
interrupt context -- but only for a very short window early at boot
between configure and configure2, so it's hard to imagine it
matters much.

- Allow rnd_add_uint32 to run in hard interrupt context or with spin
locks held, but defer processing to softint and drop samples on the
floor if buffer is full. This is mainly used for cheaply tossing
samples from drivers for non-HWRNG devices into the entropy pool,
so it is often used from interrupt context and/or under spin locks.

- New rnd_add_data_intr provides the interrupt-like data entry path
for arbitrary buffers and driver-specified entropy estimates: defer
processing to softint and drop samples on the floor if buffer is
full.

- Document that rnd_add_data is forbidden under spin locks outside
interrupt context (will crash in LOCKDEBUG), and inadvisable in
interrupt context (but technically permitted just in case there are
compatibility issues for now); later we can forbid it altogether in
interrupt context or under spin locks.

- Audit all uses of rnd_add_data to use rnd_add_data_intr where it
might be used in interrupt context or under a spin lock.

This fixes a regression from last year when the global entropy lock
was changed from IPL_VM (spin) to IPL_SOFTSERIAL (adaptive). Thought
I'd caught all the problems from that, but another one bit three
different people this week, presumably because of recent changes that
led to more non-HWRNG drivers entering the entropy consolidation
path from rnd_add_uint32.

In my attempt to preserve the rnd(9) API for the (now long-since
abandoned) prospect of pullup to netbsd-9 in my rewrite of the
entropy subsystem in 2020, I didn't introduce a separate entry point
for entering entropy from interrupt context or equivalent, i.e., spin
locks held, and instead made rnd_add_data rely on cpu_intr_p() to
decide whether to process the whole sample under a lock or only take
as much as there's buffer space for before scheduling a softint. In
retrospect, that was a mistake (though perhaps not as much of a
mistake as other entropy API decisions...), a mistake which is
finally getting rectified now by rnd_add_data_intr.

XXX pullup-10


# 1.62 30-Jun-2023 riastradh

entropy(9): Reintroduce netbsd<=9 time-delta estimator for unblocking.

The system will (in a subsequent change) by default block for this
condition before almost all of userland is running (including
/etc/rc.d/sshd key generation). That way, a never-blocking
getentropy(3) API will never return any data without at least
best-effort entropy like netbsd<=9 did to applications except in
single-user mode (where you have to be careful about everything
anyway) or in the few processes that run before a seed can even be
loaded (where blocking indefinitely, e.g. when generating a stack
protector cookie in libc, could pose a severe availability problem
that can't be configured away, but where the security impact is low).

However, (in another subsequent change) we will continue to use
_only_ HWRNG driver estimates and seed estimates, and _not_
time-delta estimator, for _warning_ about security in motd, daily
security report, etc. And if HWRNG/seed provides enough entropy
before time-delta estimator does, that will unblock /dev/random too.

The result is:

- Machines with HWRNG or seed won't warn about entropy and will
essentially never block -- even on first boot without a seed, it
will take only as long as the fastest HWRNG to unblock.

- Machines with neither HWRNG nor seed:
. will warn about entropy, giving feedback about security;
and
. will avoid returning anything more predictable than netbsd<=9;
but
. won't block (much) longer than netbsd<=9 would (and won't block
again after blocking once, except with kern.entropy.depletion=1 for
testing).

(The threshold for unblocking is now somewhat higher than before:
512 samples that pass the time-delta estimator, rather than 80 as
it used to be.)

And, of course, adding a seed (or HWRNG) will prevent both warnings
and blocking.

The mechanism is:

1. /dev/random will block until _either_

(a) enough bits of entropy (256) from reliable sources have been
added to the pool, _or_

(b) enough samples have been added from any sources (512), passing
the old time-delta entropy estimator, that the possible
security benefit doesn't justify holding up availability any
longer (`best effort'), except on systems with higher security
requirements like securelevel=2 which can disable non-HWRNG,
non-seed sources with rndctl_flags in rc.conf(5).

2. dmesg will report `entropy: ready' when 1(a) is satisfied, but if
1(b) is satisfied first, it will report `entropy: best effort', so
the concise log messages will reflect the timing and whether in
any period of time any of the system might be relying on best
effort entropy.

3. The sysctl knob kern.entropy.needed (and the ioctl RNDGETPOOLSTAT
variable rndpoolstat_t::added) still reflects the number of bits
of entropy from reliable sources, so we can still use this to
suggest regenerating ssh keys.

This matters on platforms that can only be reached, after flashing
an installation image, by sshing in over a (private) network, like
small network appliances or remote virtual machines without
(interactive) serial consoles. If we blocked indefinitely at boot
when generating ssh keys, such platforms would be unusable. This
way, platforms are usable, but operators can still be advised at
login time to regenerate keys as soon as they can actually load
entropy onto the system, e.g. with rndctl(8) on a seed file copied
from a local machine over the (private) network.

4. On machines without HWRNG, using a seed file still suppresses
warnings for users who need more confident security. But it is no
longer necessary for availability.

This is a compromise between availability and security:

- The security mechanism of blocking indefinitely on machines without
HWRNG hurts availability too much, as painful experience over the
multiple years since I made the mistake of introducing it have
shown. (Sorry!)

- The other main alternative, not having a blocking path at all (as I
pushed for, and as OpenBSD has done for a long time) could
potentially reduce security vs netbsd<=9, and would run against the
expectations set by many popular operating systems to the severe
detriment of public perception of NetBSD security.

Even though we can't _confidently_ assess enough entropy from, e.g.,
sampling interrupt timings, this is the traditional behaviour that
most operating systems provide -- and the result here is a net
nondecrease in security over netbsd<=9, because all paths from the
entropy pool to userland now have at least as high a standard before
returning data as they did in netbsd<=9.

PR kern/55641
PR pkg/55847
PR kern/57185
https://mail-index.netbsd.org/current-users/2020/09/02/msg039470.html
https://mail-index.netbsd.org/current-users/2020/11/21/msg039931.html
https://mail-index.netbsd.org/current-users/2020/12/05/msg040019.html

XXX pullup-10


# 1.61 24-May-2023 riastradh

entropy(9): Avoid race between rnd_add_data and ioctl(RNDCTL).

XXX pullup-10


# 1.60 24-May-2023 riastradh

entropy(9): On flags change, cancel any scheduled consolidation.

We've been instructed to lose confidence in existing entropy sources,
so let's make sure to re-gather enough entropy before the next
consolidation can happen, in case some of what would be counted in
consolidation is from those entropy sources.

XXX pullup-10


# 1.59 03-Mar-2023 riastradh

entropy(9): Allow changing flags on all entropy sources at once.

Entropy sources should all have nonempty names, and this will enable
an operator to, for example, disable all but a specific entropy
source.

XXX pullup-10


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

branches: 1.57.4;
entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.65 05-Aug-2023 riastradh

cprng(9): Drop and retake percpu reference across entropy_extract.

entropy_extract may sleep on an adaptive lock, which invalidates
percpu(9) references.

Add a note in the comment over entropy_extract about this.

Discovered by stumbling upon this panic during a test run:

[ 1.0200050] panic: kernel diagnostic assertion "(cprng == percpu_getref(cprng_fast_percpu)) && (percpu_putref(cprng_fast_percpu), true)" failed: file "/home/riastradh/netbsd/current/src/sys/rump/librump/rumpkern/../../../crypto/cprng_fast/cprng_fast.c", line 117

XXX pullup-10


# 1.64 04-Aug-2023 riastradh

entropy(9): Disable !cold assertion in rump for now.

Evidently rump starts threads before it sets cold = 0, and deferring
the call to rump_thread_allow(NULL) in rump.c rump_init_callback
until after setting cold = 0 doesn't work either -- rump kernels just
hang. To be investigated -- for now, let's just stop breaking
thousands of tests.


# 1.63 04-Aug-2023 riastradh

entropy(9): Simplify stages. Split interrupt vs non-interrupt paths.

- Nix the entropy stage (cold, warm, hot). Just use the usual kernel
`cold' (cold: single-core, single-thread; interrupts may happen),
and don't make any three-way distinction about whether interrupts
or threads or other CPUs can be running.

Instead, while cold, use splhigh/splx or forbid paths to come from
interrupt context, and while warm, use mutex or the per-CPU hard
and soft interrupt paths for low latency. This comes at a small
cost to some interrupt latency, since we may stir the pool in
interrupt context -- but only for a very short window early at boot
between configure and configure2, so it's hard to imagine it
matters much.

- Allow rnd_add_uint32 to run in hard interrupt context or with spin
locks held, but defer processing to softint and drop samples on the
floor if buffer is full. This is mainly used for cheaply tossing
samples from drivers for non-HWRNG devices into the entropy pool,
so it is often used from interrupt context and/or under spin locks.

- New rnd_add_data_intr provides the interrupt-like data entry path
for arbitrary buffers and driver-specified entropy estimates: defer
processing to softint and drop samples on the floor if buffer is
full.

- Document that rnd_add_data is forbidden under spin locks outside
interrupt context (will crash in LOCKDEBUG), and inadvisable in
interrupt context (but technically permitted just in case there are
compatibility issues for now); later we can forbid it altogether in
interrupt context or under spin locks.

- Audit all uses of rnd_add_data to use rnd_add_data_intr where it
might be used in interrupt context or under a spin lock.

This fixes a regression from last year when the global entropy lock
was changed from IPL_VM (spin) to IPL_SOFTSERIAL (adaptive). Thought
I'd caught all the problems from that, but another one bit three
different people this week, presumably because of recent changes that
led to more non-HWRNG drivers entering the entropy consolidation
path from rnd_add_uint32.

In my attempt to preserve the rnd(9) API for the (now long-since
abandoned) prospect of pullup to netbsd-9 in my rewrite of the
entropy subsystem in 2020, I didn't introduce a separate entry point
for entering entropy from interrupt context or equivalent, i.e., spin
locks held, and instead made rnd_add_data rely on cpu_intr_p() to
decide whether to process the whole sample under a lock or only take
as much as there's buffer space for before scheduling a softint. In
retrospect, that was a mistake (though perhaps not as much of a
mistake as other entropy API decisions...), a mistake which is
finally getting rectified now by rnd_add_data_intr.

XXX pullup-10


# 1.62 30-Jun-2023 riastradh

entropy(9): Reintroduce netbsd<=9 time-delta estimator for unblocking.

The system will (in a subsequent change) by default block for this
condition before almost all of userland is running (including
/etc/rc.d/sshd key generation). That way, a never-blocking
getentropy(3) API will never return any data without at least
best-effort entropy like netbsd<=9 did to applications except in
single-user mode (where you have to be careful about everything
anyway) or in the few processes that run before a seed can even be
loaded (where blocking indefinitely, e.g. when generating a stack
protector cookie in libc, could pose a severe availability problem
that can't be configured away, but where the security impact is low).

However, (in another subsequent change) we will continue to use
_only_ HWRNG driver estimates and seed estimates, and _not_
time-delta estimator, for _warning_ about security in motd, daily
security report, etc. And if HWRNG/seed provides enough entropy
before time-delta estimator does, that will unblock /dev/random too.

The result is:

- Machines with HWRNG or seed won't warn about entropy and will
essentially never block -- even on first boot without a seed, it
will take only as long as the fastest HWRNG to unblock.

- Machines with neither HWRNG nor seed:
. will warn about entropy, giving feedback about security;
and
. will avoid returning anything more predictable than netbsd<=9;
but
. won't block (much) longer than netbsd<=9 would (and won't block
again after blocking once, except with kern.entropy.depletion=1 for
testing).

(The threshold for unblocking is now somewhat higher than before:
512 samples that pass the time-delta estimator, rather than 80 as
it used to be.)

And, of course, adding a seed (or HWRNG) will prevent both warnings
and blocking.

The mechanism is:

1. /dev/random will block until _either_

(a) enough bits of entropy (256) from reliable sources have been
added to the pool, _or_

(b) enough samples have been added from any sources (512), passing
the old time-delta entropy estimator, that the possible
security benefit doesn't justify holding up availability any
longer (`best effort'), except on systems with higher security
requirements like securelevel=2 which can disable non-HWRNG,
non-seed sources with rndctl_flags in rc.conf(5).

2. dmesg will report `entropy: ready' when 1(a) is satisfied, but if
1(b) is satisfied first, it will report `entropy: best effort', so
the concise log messages will reflect the timing and whether in
any period of time any of the system might be relying on best
effort entropy.

3. The sysctl knob kern.entropy.needed (and the ioctl RNDGETPOOLSTAT
variable rndpoolstat_t::added) still reflects the number of bits
of entropy from reliable sources, so we can still use this to
suggest regenerating ssh keys.

This matters on platforms that can only be reached, after flashing
an installation image, by sshing in over a (private) network, like
small network appliances or remote virtual machines without
(interactive) serial consoles. If we blocked indefinitely at boot
when generating ssh keys, such platforms would be unusable. This
way, platforms are usable, but operators can still be advised at
login time to regenerate keys as soon as they can actually load
entropy onto the system, e.g. with rndctl(8) on a seed file copied
from a local machine over the (private) network.

4. On machines without HWRNG, using a seed file still suppresses
warnings for users who need more confident security. But it is no
longer necessary for availability.

This is a compromise between availability and security:

- The security mechanism of blocking indefinitely on machines without
HWRNG hurts availability too much, as painful experience over the
multiple years since I made the mistake of introducing it have
shown. (Sorry!)

- The other main alternative, not having a blocking path at all (as I
pushed for, and as OpenBSD has done for a long time) could
potentially reduce security vs netbsd<=9, and would run against the
expectations set by many popular operating systems to the severe
detriment of public perception of NetBSD security.

Even though we can't _confidently_ assess enough entropy from, e.g.,
sampling interrupt timings, this is the traditional behaviour that
most operating systems provide -- and the result here is a net
nondecrease in security over netbsd<=9, because all paths from the
entropy pool to userland now have at least as high a standard before
returning data as they did in netbsd<=9.

PR kern/55641
PR pkg/55847
PR kern/57185
https://mail-index.netbsd.org/current-users/2020/09/02/msg039470.html
https://mail-index.netbsd.org/current-users/2020/11/21/msg039931.html
https://mail-index.netbsd.org/current-users/2020/12/05/msg040019.html

XXX pullup-10


# 1.61 24-May-2023 riastradh

entropy(9): Avoid race between rnd_add_data and ioctl(RNDCTL).

XXX pullup-10


# 1.60 24-May-2023 riastradh

entropy(9): On flags change, cancel any scheduled consolidation.

We've been instructed to lose confidence in existing entropy sources,
so let's make sure to re-gather enough entropy before the next
consolidation can happen, in case some of what would be counted in
consolidation is from those entropy sources.

XXX pullup-10


# 1.59 03-Mar-2023 riastradh

entropy(9): Allow changing flags on all entropy sources at once.

Entropy sources should all have nonempty names, and this will enable
an operator to, for example, disable all but a specific entropy
source.

XXX pullup-10


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

branches: 1.57.4;
entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.62 30-Jun-2023 riastradh

entropy(9): Reintroduce netbsd<=9 time-delta estimator for unblocking.

The system will (in a subsequent change) by default block for this
condition before almost all of userland is running (including
/etc/rc.d/sshd key generation). That way, a never-blocking
getentropy(3) API will never return any data without at least
best-effort entropy like netbsd<=9 did to applications except in
single-user mode (where you have to be careful about everything
anyway) or in the few processes that run before a seed can even be
loaded (where blocking indefinitely, e.g. when generating a stack
protector cookie in libc, could pose a severe availability problem
that can't be configured away, but where the security impact is low).

However, (in another subsequent change) we will continue to use
_only_ HWRNG driver estimates and seed estimates, and _not_
time-delta estimator, for _warning_ about security in motd, daily
security report, etc. And if HWRNG/seed provides enough entropy
before time-delta estimator does, that will unblock /dev/random too.

The result is:

- Machines with HWRNG or seed won't warn about entropy and will
essentially never block -- even on first boot without a seed, it
will take only as long as the fastest HWRNG to unblock.

- Machines with neither HWRNG nor seed:
. will warn about entropy, giving feedback about security;
and
. will avoid returning anything more predictable than netbsd<=9;
but
. won't block (much) longer than netbsd<=9 would (and won't block
again after blocking once, except with kern.entropy.depletion=1 for
testing).

(The threshold for unblocking is now somewhat higher than before:
512 samples that pass the time-delta estimator, rather than 80 as
it used to be.)

And, of course, adding a seed (or HWRNG) will prevent both warnings
and blocking.

The mechanism is:

1. /dev/random will block until _either_

(a) enough bits of entropy (256) from reliable sources have been
added to the pool, _or_

(b) enough samples have been added from any sources (512), passing
the old time-delta entropy estimator, that the possible
security benefit doesn't justify holding up availability any
longer (`best effort'), except on systems with higher security
requirements like securelevel=2 which can disable non-HWRNG,
non-seed sources with rndctl_flags in rc.conf(5).

2. dmesg will report `entropy: ready' when 1(a) is satisfied, but if
1(b) is satisfied first, it will report `entropy: best effort', so
the concise log messages will reflect the timing and whether in
any period of time any of the system might be relying on best
effort entropy.

3. The sysctl knob kern.entropy.needed (and the ioctl RNDGETPOOLSTAT
variable rndpoolstat_t::added) still reflects the number of bits
of entropy from reliable sources, so we can still use this to
suggest regenerating ssh keys.

This matters on platforms that can only be reached, after flashing
an installation image, by sshing in over a (private) network, like
small network appliances or remote virtual machines without
(interactive) serial consoles. If we blocked indefinitely at boot
when generating ssh keys, such platforms would be unusable. This
way, platforms are usable, but operators can still be advised at
login time to regenerate keys as soon as they can actually load
entropy onto the system, e.g. with rndctl(8) on a seed file copied
from a local machine over the (private) network.

4. On machines without HWRNG, using a seed file still suppresses
warnings for users who need more confident security. But it is no
longer necessary for availability.

This is a compromise between availability and security:

- The security mechanism of blocking indefinitely on machines without
HWRNG hurts availability too much, as painful experience over the
multiple years since I made the mistake of introducing it have
shown. (Sorry!)

- The other main alternative, not having a blocking path at all (as I
pushed for, and as OpenBSD has done for a long time) could
potentially reduce security vs netbsd<=9, and would run against the
expectations set by many popular operating systems to the severe
detriment of public perception of NetBSD security.

Even though we can't _confidently_ assess enough entropy from, e.g.,
sampling interrupt timings, this is the traditional behaviour that
most operating systems provide -- and the result here is a net
nondecrease in security over netbsd<=9, because all paths from the
entropy pool to userland now have at least as high a standard before
returning data as they did in netbsd<=9.

PR kern/55641
PR pkg/55847
PR kern/57185
https://mail-index.netbsd.org/current-users/2020/09/02/msg039470.html
https://mail-index.netbsd.org/current-users/2020/11/21/msg039931.html
https://mail-index.netbsd.org/current-users/2020/12/05/msg040019.html

XXX pullup-10


# 1.61 24-May-2023 riastradh

entropy(9): Avoid race between rnd_add_data and ioctl(RNDCTL).

XXX pullup-10


# 1.60 24-May-2023 riastradh

entropy(9): On flags change, cancel any scheduled consolidation.

We've been instructed to lose confidence in existing entropy sources,
so let's make sure to re-gather enough entropy before the next
consolidation can happen, in case some of what would be counted in
consolidation is from those entropy sources.

XXX pullup-10


# 1.59 03-Mar-2023 riastradh

entropy(9): Allow changing flags on all entropy sources at once.

Entropy sources should all have nonempty names, and this will enable
an operator to, for example, disable all but a specific entropy
source.

XXX pullup-10


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.62 30-Jun-2023 riastradh

entropy(9): Reintroduce netbsd<=9 time-delta estimator for unblocking.

The system will (in a subsequent change) by default block for this
condition before almost all of userland is running (including
/etc/rc.d/sshd key generation). That way, a never-blocking
getentropy(3) API will never return any data without at least
best-effort entropy like netbsd<=9 did to applications except in
single-user mode (where you have to be careful about everything
anyway) or in the few processes that run before a seed can even be
loaded (where blocking indefinitely, e.g. when generating a stack
protector cookie in libc, could pose a severe availability problem
that can't be configured away, but where the security impact is low).

However, (in another subsequent change) we will continue to use
_only_ HWRNG driver estimates and seed estimates, and _not_
time-delta estimator, for _warning_ about security in motd, daily
security report, etc. And if HWRNG/seed provides enough entropy
before time-delta estimator does, that will unblock /dev/random too.

The result is:

- Machines with HWRNG or seed won't warn about entropy and will
essentially never block -- even on first boot without a seed, it
will take only as long as the fastest HWRNG to unblock.

- Machines with neither HWRNG nor seed:
. will warn about entropy, giving feedback about security;
and
. will avoid returning anything more predictable than netbsd<=9;
but
. won't block (much) longer than netbsd<=9 would (and won't block
again after blocking once, except with kern.entropy.depletion=1 for
testing).

(The threshold for unblocking is now somewhat higher than before:
512 samples that pass the time-delta estimator, rather than 80 as
it used to be.)

And, of course, adding a seed (or HWRNG) will prevent both warnings
and blocking.

The mechanism is:

1. /dev/random will block until _either_

(a) enough bits of entropy (256) from reliable sources have been
added to the pool, _or_

(b) enough samples have been added from any sources (512), passing
the old time-delta entropy estimator, that the possible
security benefit doesn't justify holding up availability any
longer (`best effort'), except on systems with higher security
requirements like securelevel=2 which can disable non-HWRNG,
non-seed sources with rndctl_flags in rc.conf(5).

2. dmesg will report `entropy: ready' when 1(a) is satisfied, but if
1(b) is satisfied first, it will report `entropy: best effort', so
the concise log messages will reflect the timing and whether in
any period of time any of the system might be relying on best
effort entropy.

3. The sysctl knob kern.entropy.needed (and the ioctl RNDGETPOOLSTAT
variable rndpoolstat_t::added) still reflects the number of bits
of entropy from reliable sources, so we can still use this to
suggest regenerating ssh keys.

This matters on platforms that can only be reached, after flashing
an installation image, by sshing in over a (private) network, like
small network appliances or remote virtual machines without
(interactive) serial consoles. If we blocked indefinitely at boot
when generating ssh keys, such platforms would be unusable. This
way, platforms are usable, but operators can still be advised at
login time to regenerate keys as soon as they can actually load
entropy onto the system, e.g. with rndctl(8) on a seed file copied
from a local machine over the (private) network.

4. On machines without HWRNG, using a seed file still suppresses
warnings for users who need more confident security. But it is no
longer necessary for availability.

This is a compromise between availability and security:

- The security mechanism of blocking indefinitely on machines without
HWRNG hurts availability too much, as painful experience over the
multiple years since I made the mistake of introducing it have
shown. (Sorry!)

- The other main alternative, not having a blocking path at all (as I
pushed for, and as OpenBSD has done for a long time) could
potentially reduce security vs netbsd<=9, and would run against the
expectations set by many popular operating systems to the severe
detriment of public perception of NetBSD security.

Even though we can't _confidently_ assess enough entropy from, e.g.,
sampling interrupt timings, this is the traditional behaviour that
most operating systems provide -- and the result here is a net
nondecrease in security over netbsd<=9, because all paths from the
entropy pool to userland now have at least as high a standard before
returning data as they did in netbsd<=9.

PR kern/55641
PR pkg/55847
PR kern/57185
https://mail-index.netbsd.org/current-users/2020/09/02/msg039470.html
https://mail-index.netbsd.org/current-users/2020/11/21/msg039931.html
https://mail-index.netbsd.org/current-users/2020/12/05/msg040019.html

XXX pullup-10


# 1.61 24-May-2023 riastradh

entropy(9): Avoid race between rnd_add_data and ioctl(RNDCTL).

XXX pullup-10


# 1.60 24-May-2023 riastradh

entropy(9): On flags change, cancel any scheduled consolidation.

We've been instructed to lose confidence in existing entropy sources,
so let's make sure to re-gather enough entropy before the next
consolidation can happen, in case some of what would be counted in
consolidation is from those entropy sources.

XXX pullup-10


# 1.59 03-Mar-2023 riastradh

entropy(9): Allow changing flags on all entropy sources at once.

Entropy sources should all have nonempty names, and this will enable
an operator to, for example, disable all but a specific entropy
source.

XXX pullup-10


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.59 03-Mar-2023 riastradh

entropy(9): Allow changing flags on all entropy sources at once.

Entropy sources should all have nonempty names, and this will enable
an operator to, for example, disable all but a specific entropy
source.

XXX pullup-10


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.58 01-Mar-2023 riastradh

random(4): Report number of bytes ready to read, not number of bits.

Only affects systems with the diagnostic and testing option
kern.entropy.depletion=1.

XXX pullup-10


Revision tags: netbsd-10-base bouyer-sunxi-drm-base
# 1.57 05-Aug-2022 riastradh

entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.57 05-Aug-2022 riastradh

entropy: Don't disclose stack garbage in kern.entropy sysctls.

kern.entropy.consolidate and kern.entropy.gather are supposed to be
write-only -- it doesn't make any sense to read from them, but I
suppose it's better to read-as-zero than read-as-stack-secrets!


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.56 13-May-2022 riastradh

entropy(9): Update comment about where entropy_extract is allowed.

As of last month, it is forbidden in all hard interrupt context.


# 1.55 13-May-2022 riastradh

entropy(9): Note rules about how to use entropy_extract output.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.54 24-Mar-2022 riastradh

entropy(9): Call entropy_softintr while bound to CPU.

It looks like We tripped on the new assertion in entropy_account_cpu
when there was pending entropy on cpu0 running lwp0 when xc_broadcast
ran -- since xc_broadcast calls the function directly rather than
calling it through softint_schedule, it's not called via the softint
lwp which would satisfy the assertion.


# 1.53 23-Mar-2022 riastradh

entropy(9): Include <sys/lwp.h> and <sys/proc.h> explicitly.

Now that we use curlwp, struct lwp::l_pflag, and LP_BOUND, let's not
rely on side-loads from other .h files.


# 1.52 23-Mar-2022 riastradh

entropy(9): Bind to CPU temporarily to avoid race with lwp migration.

More fallout from the IPL_VM->IPL_SOFTSERIAL change.

In entropy_enter, there is a window when the lwp can be migrated to
another CPU:

ec = entropy_cpu_get();
...
pending = ec->ec_pending + ...;
...
entropy_cpu_put();

/* lwp migration possible here */

if (pending)
entropy_account_cpu(ec);

If this happens, we may trip over any of several problems in
entropy_account_cpu because it assumes ec is the current CPU's state
in order to decide whether we have anything to contribute from the
local pool to the global pool.

No need to do this in entropy_softintr because softints are bound to
the CPU anyway.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.51 21-Mar-2022 riastradh

entropy(9): Make rnd_lock_sources work while cold.

x86 uses entropy_extract verrrrrry early. Fixes mistake in previous
that did not manifest in my testing on aarch64, which does not use it
so early.


# 1.50 20-Mar-2022 riastradh

entropy(9): Improve entropy warning messages and documentation.

- For the main warning message, use less jargon, say `security', and
cite the entropy(7) man page for further reading. Document this in
rnd(4) and entropy(7).

- For the debug-only warning message, say `entropy' only once and omit
it from the rnd(4) man page -- it's not very important unless you're
debugging the kernel in which case you probably know what you're
doing enough to not need the text explained in the man page.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.49 20-Mar-2022 riastradh

entropy(9): Fix premature optimization deadlock in entropy_request.

- For synchronous queries from /dev/random, which are waiting for
entropy to be ready, wait for concurrent access -- e.g., concurrent
rnd_detach_source -- to finish, and make sure to request entropy
from all sources (unless we're interrupted by a signal).

- When invoked through softint context (e.g., cprng_fast_intr ->
cprng_strong -> entropy_extract), don't wait, because we're
forbidden from waiting anyway.

- For entropy_bootrequest, wait but don't bother failing on signal
because this only happens in kthread context, not in userland
process context, so there can't be signals.

Nix rnd_trylock_sources; use the same entropy_extract flags
(ENTROPY_WAIT, ENTROPY_SIG) for rnd_lock_sources.


# 1.48 20-Mar-2022 riastradh

Revert "entropy(9): Nix rnd_trylock_sources."

Not a premature optimization after all -- this is necessary because
entropy_request can run in softint context, where the cv_wait_sig in
rnd_lock_sources is forbidden. Need to do this another way.


# 1.47 20-Mar-2022 riastradh

entropy(9): Nix rnd_trylock_sources.

This was a premature optimization that turned out to be bogus. It's
not harmful to request more than we need from drivers, so let's not
go out of our way to avoid that.


# 1.46 20-Mar-2022 riastradh

entropy(9): Fix another new race in entropy_account_cpu.

The consolidation xcall can preempt entropy_enter, between when it
unlocks the per-CPU state and when it calls entropy_account_cpu, with
the effect of setting ec->ec_pending=0.

Previously this was impossible because we called entropy_account_cpu
with the per-CPU state still locked, but that doesn't work now that
the global entropy lock is an adaptive lock which might sleep which
is forbidden while the per-CPU state is locked.


# 1.45 20-Mar-2022 riastradh

entropy(9): Shuffle some assertions around.

Tripped over (diff || E->pending == ENTROPY_CAPACITY*NBBY), not sure
why yet, printing values will help.

No functional change intended.


# 1.44 20-Mar-2022 riastradh

entropy(9): Lock the per-CPU state in entropy_account_cpu.

This was previously called with the per-CPU state locked, which
worked fine as long as the global entropy lock was a spin lock so
acquiring it would never sleep. Now it's an adaptive lock, so it's
not safe to take with the per-CPU state lock -- but we still need to
prevent reentrant access to the per-CPU entropy pool by interrupt
handlers while we're extracting from it. So now the logic for
entering a sample is:

- lock per-CPU state
- entpool_enter
- unlock per-CPU state
- if anything pending on this CPU and it's time to consolidate:
- lock global entropy state
- lock per-CPU state
- transfer
- unlock per-CPU state
- unlock global entropy state


# 1.43 20-Mar-2022 riastradh

entropy(9): Factor out logic to lock and unlock per-CPU state.

No functional change intended.


# 1.42 20-Mar-2022 riastradh

entropy(9): Avoid reentrance to per-CPU state from sleeping on lock.

Changing the global entropy lock from IPL_VM to IPL_SOFTSERIAL meant
it went from being a spin lock, which blocks preemption, to being an
adaptive lock, which might sleep -- and allow other threads to run
concurrently with the softint, even if those threads have softints
blocked with splsoftserial.

This manifested as KASSERT(!ec->ec_locked) triggering in
entropy_consolidate_xc -- presumably entropy_softintr slept on the
global entropy lock while holding the per-CPU state locked with
ec->ec_locked, and then entropy_consolidate_xc ran.

Instead, to protect access to the per-CPU state without taking a
global lock, defer entropy_account_cpu until after ec->ec_locked is
cleared. This way, we never sleep while holding ec->ec_locked, nor
do we incur any contention on shared memory when entering entropy
unless we're about to distribute it. To verify this, sprinkle in
assertions that curlwp->l_ncsw hasn't changed by the time we release
ec->ec_locked.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.41 19-Mar-2022 riastradh

rnd(9): Delete legacy rnd_initial_entropy symbol.

Use entropy_epoch() instead.

XXX kernel ABI change deleting symbol requires bump


# 1.40 18-Mar-2022 riastradh

entropy(9): Count dropped or truncated interrupt samples.


# 1.39 18-Mar-2022 riastradh

entropy(9): Reduce global entropy lock from IPL_VM to IPL_SOFTSERIAL.

This is no longer ever taken in hard interrupt context, so there's no
longer any need to block interrupts while doing crypto operations on
the global entropy pool.


# 1.38 18-Mar-2022 riastradh

entropy(9): Request entropy after the softint is enabled.

Otherwise, there is a window during which interrupts are running, but
the softint is not, so if many interrupts queue (low-entropy) samples
early at boot, they might get dropped on the floor. This could
happen, for instance, with a PCI RNG like ubsec(4) or hifn(4) which
requests entropy and processes it in its own hard interrupt handler.


# 1.37 18-Mar-2022 riastradh

entropy(9): Use the early-entropy path only while cold.

This way, we never take the global entropy lock from interrupt
handlers (no interrupts while cold), so the global entropy lock need
not block interrupts.

There's an annoying ordering issue here: softint_establish doesn't
work until after CPUs have been detected, which happens inside
configure(), which is also what enables interrupts. So we have no
opportunity to softint_establish the entropy softint _before_
interrupts are enabled.

To work around this, we have to put a conditional into the interrupt
path, and go out of our way to process any queued samples after
establishing the softint. If we just made softint_establish work
early, like percpu_create does now, this problem would go away and we
could delete a bit of logic here.

Candidate fix for PR kern/56730.


# 1.36 18-Mar-2022 riastradh

entropy(9): Create per-CPU state earlier.

This will make it possible to use it from interrupts as soon as they
start, which means the global entropy pool lock won't have to block
interrupts.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.35 16-Mar-2022 riastradh

entropy(9): Forbid entropy_extract in hard interrupt context.

With a little additional work, this will let us reduce the global
entropy pool lock so it never blocks interrupts.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.34 04-Mar-2022 andvar

fix few typos in comments for word "because".


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.33 26-Sep-2021 thorpej

entropy_read_filtops is MPSAFE.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.32 26-Sep-2021 thorpej

Change the kqueue filterops::f_isfd field to filterops::f_flags, and
define a flag FILTEROP_ISFD that has the meaning of the prior f_isfd.
Field and flag name aligned with OpenBSD.

This does not constitute a functional or ABI change, as the field location
and size, and the value placed in that field, are the same as the previous
code, but we're bumping __NetBSD_Version__ so 3rd-party module source code
can adapt, as needed.

NetBSD 9.99.89


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.31 21-Sep-2021 christos

don't opencode kauth_cred_get()


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.30 12-Feb-2021 jmcneill

entropy: Only print consolidation warning of AB_DEBUG.

The previous fix for PR kern/55458 changed printf to log(LOG_DEBUG, ...) with
the intent of hiding the message unless 'boot -x'. But this did not actually
suppress the message to console as log(LOG_DEBUG, ...) will print to console
if syslogd is not running yet.

So instead, just check for AB_DEBUG flag directly in boothowto, and only
printf the message if it is set.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


Revision tags: thorpej-futex-base
# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.29 21-Jan-2021 riastradh

entropy: Reduce `no seed from bootloader' message to debug level.

This does not necessarily indicate a problem -- only x86 and arm pass
a seed from the bootloader anyway -- so it makes for an always-on
warning on some platforms, including all rump kernels, which is not
helpful.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


Revision tags: thorpej-futex-base
# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.28 16-Jan-2021 riastradh

entropy: Record number of time and data samples for userland.

This more or less follows the semantics of the RNDGETESTNUM and
RNDGETESTNAME ioctls to restore useful `rndctl -lv' output.

Specifically: We count the number of time or data samples entered
with rnd_add_*. Previously it would count the total number of 32-bit
words in the data, rather than the number of rnd_add_* calls that
enter data, but I think the number of calls makes more sense here.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


Revision tags: thorpej-futex-base
# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.27 13-Jan-2021 riastradh

entropy: Use a separate condvar for rndsource list lock.

Otherwise, two processes both waiting for entropy will dance around
waking each other up (by releasing the rndsource list lock) and going
back to sleep (waiting for entropy).

Witnessed on the armv7 testbed when /etc/security presumably ran
twice over a >day-long test, until the metaphorical plug got pulled:

net/if_ipsec/t_ipsec_natt (509/888): 2 test cases
ipsecif_natt_transport_null: [ 37123.2631856] entropy: pid 1005 (dd) blocking due to lack of entropy
[256.523317s] Failed: atf-check failed; see the output of the test for details
ipsecif_natt_transport_rijndaelcbc: [274.370791s] Failed: atf-check failed; see the output of the test for details
[532.486697s]
...
puffs_lstat_symlink: [ 123442.1606517] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1835067] entropy: pid 1005 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 9499 (dd) blocking due to lack of entropy
[ 123442.1944600] entropy: pid 1005 (dd) blocking due to lack of entropy
...


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


Revision tags: thorpej-futex-base
# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.26 11-Jan-2021 riastradh

entropy: Downgrade consolidation warning from printf to LOG_DEBUG.

Candidate fix for PR kern/55458. This message is rather technical,
and so is unlikely to be acted on by anyone not debugging the kernel
anyway. Most likely, on any system where it is a real problem, there
will be another (less technical) entropy warning anyway.


Revision tags: thorpej-futex-base
# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


# 1.24 29-Sep-2020 gson

branches: 1.24.2;
Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.25 11-Dec-2020 thorpej

Use sel{record,remove}_knote().


Revision tags: thorpej-futex-base
# 1.24 29-Sep-2020 gson

Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.24 29-Sep-2020 gson

Log a message when a process blocks due to a lack of entropy.
Discussed on tech-kern.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.23 14-Aug-2020 riastradh

New system call getrandom() compatible with Linux and others.

Three ways to call:

getrandom(p, n, 0) Blocks at boot until full entropy.
Returns up to n bytes at p; guarantees
up to 256 bytes even if interrupted
after blocking. getrandom(0,0,0)
serves as an entropy barrier: return
only after system has full entropy.

getrandom(p, n, GRND_INSECURE) Never blocks. Guarantees up to 256
bytes even if interrupted. Equivalent
to /dev/urandom. Safe only after
successful getrandom(...,0),
getrandom(...,GRND_RANDOM), or read
from /dev/random.

getrandom(p, n, GRND_RANDOM) May block at any time. Returns up to n
bytes at p, but no guarantees about how
many -- may return as short as 1 byte.
Equivalent to /dev/random. Legacy.
Provided only for source compatibility
with Linux.

Can also use flags|GRND_NONBLOCK to fail with EWOULDBLOCK/EAGAIN
without producing any output instead of blocking.

- The combination GRND_INSECURE|GRND_NONBLOCK is the same as
GRND_INSECURE, since GRND_INSECURE never blocks anyway.

- The combinations GRND_INSECURE|GRND_RANDOM and
GRND_INSECURE|GRND_RANDOM|GRND_NONBLOCK are nonsensical and fail
with EINVAL.

As proposed on tech-userlevel, tech-crypto, tech-security, and
tech-kern, and subsequently adopted by core (minus the getentropy part
of the proposal, because other operating systems and participants in
the discussion couldn't come to an agreement about getentropy and
blocking semantics):

https://mail-index.netbsd.org/tech-userlevel/2020/05/02/msg012333.html


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.22 12-May-2020 riastradh

Don't invoke callbacks of rndsources with collection disabled.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.21 10-May-2020 riastradh

Make rndctl -E/-C reset entropy accounting.

If we don't trust a source, it's unreasonable to trust any entropy it
previously provided, and we don't have any way to undo only the
effects of that source, so just zero our estimate of the entropy in
the pool and start over.

(However, keep the samples already in the pool -- just treat them as
though they had zero entropy and start gathering more.)


# 1.20 10-May-2020 riastradh

Fix comments.


# 1.19 10-May-2020 riastradh

Use a temporary pool to consolidate entropy atomically.

There was a low-probability race with the entropy consolidation
logic: calls to entropy_extract at the same time as consolidation is
happening might witness partial contributions from the CPUs when
needed=256, say 64 bits at a time.

To avoid this, feed everything from the per-CPU pools into a
temporary pool, and then feed the temporary pool into the global pool
under the lock at the same time as we update needed.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.18 09-May-2020 riastradh

Prune dead branch.


# 1.17 08-May-2020 riastradh

Make variable unused outside kern_entropy.c static.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.16 08-May-2020 riastradh

Eliminate curcpu_available() hack.

The entropy subsystem is no longer used before curcpu() and curlwp
are available on x86.


# 1.15 08-May-2020 riastradh

Make curcpu_available() always true.

This should work now that x86 runs cpu_init_rng just after curcpu()
and curlwp are initialized, and no other architecture needs it to
work that early.


# 1.14 07-May-2020 riastradh

Print `entropy: ready' only when we first have full entropy.

Now that we consolidate entropy in rndctl -L and equivalent, not just
when the operator chooses, epoch != -1 no longer necessarily means
full entropy -- it just means `time to (re)seed, whether justified by
entropy accounting or by explicit consolidation'.

There is a bug on x86 systems with RDRAND/RDSEED that prevents this
message from appearing at all: it happens so early that consinit has
not run yet, so it just goes into oblivion. Need to fix that some
other way!


# 1.13 07-May-2020 riastradh

Consolidate entropy on RNDADDDATA and writes to /dev/random.

The man page for some time has advertised:

Writing to either /dev/random or /dev/urandom influences subsequent
output of both devices, guaranteed to take effect at next open.

So let's make that true again.

It is a conscious choice _not_ to consolidate entropy frequently.
For example, if you have a _slow_ HWRNG, which provides 32 bits of
entropy every few seconds, and you reveal a hash that to the
adversary before any more comes in, the adversary can in principle
just keep guessing the intermediate state by a brute force search
over ~2^32 possibilities.

To mitigate this, the kernel generally tries to avoid consolidating
entropy from the per-CPU pools until doing so would bring us from
zero entropy to full entropy.

However, there are various _possible_ sources of entropy which are
just hard to give honest estimates for that are valid on ~all
machines -- like interrupt timings. The time at which we read a seed
in, which usually happens via /etc/rc.d/random_seed early in
userland, is a reasonable time to gather this up. An operator or
system engineer who knows another opportune moment can always issue
`sysctl -w kern.entropy.consolidate=1'.

Prompted by a suggestion from nia@ to consolidate entropy at the
first transition to userland. I chose not to do that because it
would likely cause warning fatigue on systems that are perfectly fine
with a random seed -- doing it this way instead lets rndctl -L
trigger the consolidation automatically. A subsequent commit will
reorder the operations in rndctl again to make it work out better.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.12 07-May-2020 riastradh

Fix two mistakes in entropy accounting.

1. When consolidating entropy from per-CPU pools, drop the amount
pending to zero; otherwise the entropy consolidation thread might
keep consolidating every second.

This uncovered a feedback loop with kern.entropy.depletion=1 and
on-demand entropy sources, which is that depleting the pool and then
requesting more from it causes the on-demand entropy sources to
trigger reseed, which causes cprng_fast/strong to request more which
depletes the pool again which causes on-demand entropy sources to
trigger reseed, and so on.

To work around this:

2. Set a rate limit on reseeding (advancing the entropy epoch) when
kern.entropy.depletion=1; otherwise reseeding gets into a feedback
loop when there are on-demand entropy sources like RDRAND/RDSEED.

(By default, kern.entropy.depletion=0, so this mainly only affects
systems where you're simulating what happens when /dev/random blocks
for testing.)


# 1.11 06-May-2020 riastradh

Don't reject seed file entropy estimates, until one is nonzero.

We try to avoid counting the seed file's entropy twice, e.g. once
from the boot loader and once from rndctl via /etc/rc.d/random_seed.

But previously, if you had a /var/db/entropy-file that was deemed to
have zero entropy, that would prevent rndctl -L from _ever_ setting a
nonzero entropy estimate, even if you (say) copy a seed file over
from another machine (over a non-eavesdroppable medium) and try to
load it in with rndctl -L, e.g. via `/etc/rc.d/random_seed start'.

Now we accept the first _nonzero_ entropy estimate from a seed file.

The operator can still always trick the kernel into believing there's
entropy in the system by writing data to /dev/random, if the operator
knows something the kernel doesn't; this only affects the _automated_
seed file loading.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.10 05-May-2020 riastradh

New sysctl kern.entropy.gather=1 to trigger entropy gathering.

Invokes all on-demand RNG sources. This enables HWRNG driver
developers to use a dtrace probe on rnd_add_data to examine the data
coming out of the HWRNG:

dtrace -n 'fbt::rnd_add_data:entry /args[0]->name == "amdccp0"/ {
...examine buffer args[1] length args[2]...
}'


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.9 03-May-2020 riastradh

Initialize struct krndsource::total to zero.

Avoids bogus counts reported by `rndctl -l' in the event that the
caller neglected to zero the rndsource ahead of time.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.8 01-May-2020 riastradh

Fix sense of conditional in previous.

I must have tested (cold ? (void *)1 : curlwp) but then decided,
after testing, to replace cold by !curcpu_available() -- thinking
that would be a safe change to make, except I forgot to either write
the ! or change the sense of the conditional. OOPS.


# 1.7 30-Apr-2020 riastradh

Mark rnd_sources_locked __diagused -- only for KASSERTs.


# 1.6 30-Apr-2020 riastradh

curlwp may not be available early enough for kern_entropy.c.

Fortunately, we're just using it to print helpful diagnostic messages
in kasserts here, so while we're still cold just use (void *)1 for
now until someone figures out how to make curlwp available earlier on
x86.

(All of the curcpu_available() business is a provisional crock here
and it would be better to get rid of it.)


# 1.5 30-Apr-2020 riastradh

Missed a spot! (Part II(b) of no percpu_foreach under spin lock.)


# 1.4 30-Apr-2020 riastradh

Lock the rndsource list without E->lock for ioctls too.

Use the same mechanism as entropy_request, with a little more
diagnostic information in case anything goes wrong. No need for
LIST_FOREACH_SAFE; elements cannot be deleted while the list is
locked.

This is part II of avoiding percpu_foreach with spin lock held.


# 1.3 30-Apr-2020 riastradh

Avoid calling entropy_pending() with E->lock held.

This is part I of avoiding percpu_foreach with spin lock held.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.2 30-Apr-2020 riastradh

Accept both byte orders for random seed in the kernel.

The file format was defined with a machine-dependent 32-bit integer
field (the estimated number of bits of entropy in the process that
generated it). Fortunately we have a checksum to verify the order.

This way you can use `rndctl -S' on a little-endian machine to
generate a seed when installing NetBSD on a big-endian machine, and
the kernel will accept it on boot.


# 1.1 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.