History log of /openbsd-current/sys/sys/systm.h
Revision Date Author Comments
# 1.171 28-May-2024 jsg

remove maxmem extern, var removed from all archs long ago


Revision tags: OPENBSD_7_5_BASE
# 1.170 30-Oct-2023 claudio

Adjust KERNEL_ASSERT_UNLOCKED() to not assert during a panic.

KERNEL_ASSERT_UNLOCKED calls _kernel_lock_held() which returns true
if panicstr || db_active, which triggers this assert. Work around this by
checking them before.
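
The interaction described above can be modeled in user space. This is only a sketch: every sketch_* name is a hypothetical stand-in rather than the kernel's definition, and only the logic stated in the commit message (the lock reports "held" during a panic or while in the debugger, so the check has to look at that state first) is reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's global state. */
const char *sketch_panicstr = NULL;	/* non-NULL once a panic has started */
bool sketch_db_active = false;		/* true while inside the debugger */
bool sketch_lock_owned = false;		/* does the caller own the lock? */

/* Mirrors the described behavior: claims "held" during panic or ddb. */
bool
sketch_kernel_lock_held(void)
{
	if (sketch_panicstr != NULL || sketch_db_active)
		return true;
	return sketch_lock_owned;
}

/* Returns true if the "unlocked" assertion would pass. */
bool
sketch_assert_unlocked_ok(void)
{
	/* The workaround: look at panic/ddb state before asking the lock. */
	if (sketch_panicstr != NULL || sketch_db_active)
		return true;
	return !sketch_kernel_lock_held();
}
```

Without the early return, sketch_assert_unlocked_ok() would report a failure on every call made during a panic, which is the situation the Syzkaller reports ran into.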

This will alter the following Syzkaller reports:
Reported-by: syzbot+169110a0815838ab5940@syzkaller.appspotmail.com
Reported-by: syzbot+3c2eced405b9de6f79c2@syzkaller.appspotmail.com

OK mpi@


# 1.169 17-Oct-2023 cheloha

clockintr: move callback-specific API behaviors to "clockrequest" namespace

The API's behavior when invoked from a callback function is impossible
to document. Move the special behavior into a distinct namespace,
"clockrequest".

- Add a 'struct clockrequest'. Basically a stripped-down 'struct clockintr'
for exclusive use during clockintr_dispatch().
- In clockintr_queue, replace the "cq_shadow" clockintr with a "cq_request"
clockrequest. They serve the same purpose.
- CLST_SHADOW_PENDING -> CR_RESCHEDULE; different namespace, same meaning.
- CLST_IGNORE_SHADOW -> CLST_IGNORE_REQUEST; same meaning.
- Move shadow branch in clockintr_advance() to clockrequest_advance().
- clockintr_request_random() becomes clockrequest_advance_random().
- Delete dead shadow branches in clockintr_cancel(), clockintr_schedule().
- Callback functions now get a clockrequest pointer instead of a special
clockintr pointer: update all prototypes, callers.

No functional change intended.


# 1.168 11-Oct-2023 cheloha

kernel: expand fixed clock interrupt periods to 64-bit values

Technically, all the current fixed clock interrupt periods fit within
an unsigned 32-bit value. But 32-bit multiplication is an accident
waiting to happen. So, expand the fixed periods for hardclock,
statclock, profclock, and roundrobin to 64-bit values.

One exception: statclock_mask remains 32-bit because random(9) yields
32-bit values. Update the initclocks() comment to make it clear that
this is not an accident.
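
To illustrate why 32-bit multiplication is "an accident waiting to happen", here is a small sketch. The helper names and the sample period are made up for the example; nothing here is taken from kern_clock.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: period * count evaluated entirely in 32 bits. */
uint64_t
elapsed_nsec_32(uint32_t period_nsec, uint32_t count)
{
	uint32_t product = period_nsec * count;	/* wraps modulo 2^32 */

	return product;
}

/* Same computation with the period widened to 64 bits first. */
uint64_t
elapsed_nsec_64(uint64_t period_nsec, uint32_t count)
{
	return period_nsec * count;
}
```

A 10 ms period (10000000 ns) fits comfortably in 32 bits, but multiplying it by 1000 expirations does not: the 32-bit version silently wraps while the 64-bit version yields the full 10000000000 ns.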


Revision tags: OPENBSD_7_4_BASE
# 1.167 14-Sep-2023 cheloha

clockintr, statclock: eliminate clockintr_statclock() wrapper

- Move remaining statclock variables from kern_clockintr.c to
kern_clock.c. Move statclock variable initialization from
clockintr_init() into initclocks().

- Change statclock() prototype to make it a legal clockintr
callback function and establish the handle with statclock()
instead of clockintr_statclock().

- Merge the contents of clockintr_statclock() into statclock().
statclock() can now reschedule itself and handles multiple
expirations transparently.

- Make statclock_avg visible from sys/systm.h so that clockintr_cpu_init()
can use it to advance the statclock across suspend/hibernate.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.166 14-Sep-2023 cheloha

clockintr: replace CL_RNDSTAT with global variable statclock_is_randomized

In order to separate the statclock from the clock interrupt subsystem
we need to move all statclock state out into the broader kernel.

Start by replacing the CL_RNDSTAT flag with a new global variable,
"statclock_is_randomized", in kern_clock.c. Update all clockintr_init()
callers to set the boolean instead of passing the flag.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.
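
The resulting call order can be sketched as follows. The sketch_* functions are hypothetical stubs that only record when they run; the real routines are machine-dependent and do the work described in the guidelines above.

```c
#include <assert.h>
#include <string.h>

/* Call log so the ordering can be inspected; purely illustrative. */
char sketch_log[64];

static void
sketch_record(const char *name)
{
	strncat(sketch_log, name, sizeof(sketch_log) - strlen(sketch_log) - 1);
}

/* MD: would initialize hz, stathz, profhz and call clockintr_init(). */
void sketch_cpu_initclocks(void) { sketch_record("init;"); }

/* MI: setup that has to run between the two MD steps. */
void sketch_mi_setup(void) { sketch_record("mi;"); }

/* MD: would call clockintr_cpu_init() and start the interrupt cycle. */
void sketch_cpu_startclock(void) { sketch_record("start;"); }

void
sketch_initclocks(void)
{
	sketch_log[0] = '\0';
	sketch_cpu_initclocks();
	sketch_mi_setup();
	sketch_cpu_startclock();
}
```

The point of the split is the middle call: when cpu_initclocks() also starts the clock interrupt, there is no place for the machine-independent step to run.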

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish(); the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.
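
A range check of this kind might look roughly like the sketch below. The structure and function names are hypothetical, and treating an address range that wraps around as an error is an assumption of the example, not something stated in the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one protected text range: [start, end). */
struct sketch_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return EFAULT if [uaddr, uaddr + len) overlaps any protected range,
 * otherwise 0. No locking is needed as long as the ranges never change.
 */
int
sketch_xonly_check(const struct sketch_range *r, size_t nranges,
    uintptr_t uaddr, size_t len)
{
	uintptr_t uend;
	size_t i;

	if (len == 0)
		return 0;
	if (uaddr > UINTPTR_MAX - len)
		return EFAULT;		/* assumption: reject wraparound */
	uend = uaddr + len;
	for (i = 0; i < nranges; i++) {
		if (uaddr < r[i].end && r[i].start < uend)
			return EFAULT;
	}
	return 0;
}
```

A wrapper in front of copyin() would call such a check first and only perform the copy when it returns 0.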

On a machine without mmu enforcement, a test program reports the following:

                  userland      kernel
ld.so             readable      unreadable
mmap xz           unreadable    unreadable
mmap x            readable      readable
mmap nrx          readable      readable
mmap nwx          readable      readable
mmap xnwx         readable      readable
main              readable      unreadable
libc unmapped?    readable      unreadable
libc mapped       readable      unreadable

ok kettenis, additional help from miod


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.
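
The escape detection can be pictured as a two-state machine. The sketch below is an illustration only: sketch_rint() is a hypothetical stand-in, it omits the ddb.console check, and it returns a flag where a real driver hook would enter the debugger.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_ESC	0x1b

/* Hypothetical escape-tracking state for one console. */
static bool sketch_saw_esc = false;

/*
 * Feed one received character. Returns true when ESC followed by 'D'
 * has just been seen, i.e. when a real driver would enter the debugger.
 */
bool
sketch_rint(int c)
{
	if (sketch_saw_esc) {
		sketch_saw_esc = false;
		return (c == 'D');
	}
	if (c == SKETCH_ESC)
		sketch_saw_esc = true;
	return false;
}
```

A serial driver's receive path would feed every incoming character through such a function before passing it on to the tty layer.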

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.
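
The buffer and panicstr rules above can be sketched in user space with C11 atomics. All sketch_* names are hypothetical; the real buffer is ci_panicbuf inside struct cpu_info, and the console output is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_NCPU		2
#define SKETCH_PANICBUF_LEN	512

/* Hypothetical per-CPU state. */
struct sketch_cpu {
	char panicbuf[SKETCH_PANICBUF_LEN];
};

struct sketch_cpu sketch_cpus[SKETCH_NCPU];
_Atomic(const char *) sketch_panicstr = NULL;

/* Record a panic message on the given CPU. */
void
sketch_panic(int cpu, const char *msg)
{
	struct sketch_cpu *ci = &sketch_cpus[cpu];
	const char *expected = NULL;
	const char *mine = ci->panicbuf;

	/* Only the first panic on this CPU fills its buffer. */
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Only the first CPU to get here wins the global pointer. */
	atomic_compare_exchange_strong(&sketch_panicstr, &expected, mine);
}
```

Because each CPU writes only its own buffer, two CPUs panicking at once cannot interleave their messages, and the compare-and-swap identifies which CPU panicked first.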

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}
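
For readers without the kernel headers at hand, the saturate-then-clamp idea can be reproduced in user space. The SKETCH_* macros and sketch_* functions are hypothetical stand-ins; in particular, defining the maximum as UINT64_MAX - 1 is an assumption based on the statement that it lies below INFSLP.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)	/* assumed largest real timeout */

/* Saturating timespec-to-nanoseconds conversion; ts must be non-negative. */
uint64_t
sketch_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a caller who wanted a timeout never passes INFSLP. */
uint64_t
sketch_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sketch_timespec_to_nsec(ts);

	return (nsecs < SKETCH_MAXTSLP) ? nsecs : SKETCH_MAXTSLP;
}
```

Without the clamp, a very large but finite timespec would saturate to UINT64_MAX and be interpreted as "sleep forever", which is exactly the mistake the macro is meant to prevent.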

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.
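
A guarded mapping of this kind could look like the sketch below. The sketch_memcpy name and the exact compiler test are illustrative guesses, not the checks used in the header; __has_builtin is a feature test offered by clang and newer gcc releases.

```c
#include <assert.h>
#include <string.h>

/* Compilers without the feature test simply report "not available". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/*
 * Map the copy routine to the compiler builtin when one is known to
 * exist; otherwise fall back to the ordinary function.
 */
#if __has_builtin(__builtin_memcpy) || (defined(__GNUC__) && __GNUC__ >= 4)
#define sketch_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#else
#define sketch_memcpy(d, s, n)	memcpy((d), (s), (n))
#endif
```

With -ffreestanding the compiler no longer assumes that a function named memcpy has the standard semantics, so an explicit mapping like this is what lets small fixed-size copies keep being inlined.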

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
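
The wrapper arrangement from the three steps above can be sketched with a heavily simplified stand-in for struct uio. Every sketch_* name is hypothetical, and returning EINVAL for a negative count is an assumption of this example; the kernel wrapper may handle that case differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, drastically simplified stand-in for struct uio. */
struct sketch_uio {
	char	*buf;		/* destination buffer */
	size_t	 resid;		/* bytes of space remaining */
};

/* size_t flavor: copy up to n bytes, bounded by the remaining space. */
int
sketch_uiomove(const void *cp, size_t n, struct sketch_uio *uio)
{
	if (n > uio->resid)
		n = uio->resid;
	memcpy(uio->buf, cp, n);
	uio->buf += n;
	uio->resid -= n;
	return 0;
}

/* Legacy int flavor: validate the sign, then defer to the new function. */
int
sketch_uiomovei(const void *cp, int n, struct sketch_uio *uio)
{
	if (n < 0)
		return EINVAL;	/* assumption, see the note above */
	return sketch_uiomove(cp, (size_t)n, uio);
}
```

Keeping the int version as a thin wrapper lets callers be audited and converted one at a time, after which the wrapper can be deleted.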


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto
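
The argument swap is the easy part to get wrong in such a conversion. In this sketch, sketch_ovbcopy() is a hypothetical stand-in that keeps the old (source, destination, length) order and forwards to memmove(), which takes the destination first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in with the old (src, dst, len) argument order. */
void
sketch_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);	/* note: destination comes first here */
}
```

Both forms handle overlapping buffers, so a call such as sketch_ovbcopy(a, a + 2, 4) is equivalent to memmove(a + 2, a, 4).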


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
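
One common way to obtain such a guarantee is to perform the stores through a volatile pointer. The sketch below shows that technique only; it is not claimed to be how the kernel implements explicit_bzero, and sketch_explicit_bzero is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Zero a buffer through a volatile pointer so the stores count as
 * observable side effects and cannot be dropped as dead writes.
 */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}
```

A plain bzero() of a key buffer just before it goes out of scope is a typical candidate for dead-store elimination, which is the case this kind of function exists for.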


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok
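
The rule described in this entry can be sketched as a small predicate. This is
only an illustration in plain C: the flag names SY_NOLOCK and SY_MPSAFE, their
values, and the helper needs_biglock() are hypothetical stand-ins chosen for
the sketch, not definitions taken from syscalls.master or from this header.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen for this sketch only. */
#define SY_NOLOCK	0x01	/* syscall grabs no locks whatsoever */
#define SY_MPSAFE	0x02	/* syscall deals with its own locking */

/*
 * Decide whether the big lock must be held around a syscall.
 * cpu_takes_biglock_intr tells whether the current CPU handles
 * interrupts that require the big lock.
 */
static bool
needs_biglock(int sy_flags, bool cpu_takes_biglock_intr)
{
	if (sy_flags & SY_NOLOCK)
		return false;		/* never needs the big lock */
	if ((sy_flags & SY_MPSAFE) && !cpu_takes_biglock_intr)
		return false;		/* safe on this CPU only */
	return true;			/* unmarked, or lock ordering matters */
}
```

An unmarked syscall always takes the big lock in this sketch, which matches
the default the entry implies.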


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
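
A minimal userland sketch of the flag semantics described in this entry,
assuming a simple singly linked list. The names struct hook, hook_add(),
run_hooks() and count_hook() are hypothetical and the flag values are
arbitrary; the kernel's real hook structures and dohooks() are not reproduced
here.

```c
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define HOOK_FREE	0x02	/* free the hook after running it */

/* Hypothetical, simplified hook descriptor (not the kernel's). */
struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example callback: bump the int counter passed as argument. */
static void
count_hook(void *arg)
{
	(*(int *)arg)++;
}

/* Prepend a hook to the list; returns NULL if allocation fails. */
static struct hook *
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honouring HOOK_REMOVE and HOOK_FREE. */
static void
run_hooks(struct hook **head, int flags)
{
	struct hook *h, *next;

	if (flags & HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			/* Unlink first, so a failing hook is never re-run. */
			*head = h->next;
			h->fn(h->arg);
			if (flags & HOOK_FREE)
				free(h);
		}
		return;
	}
	for (h = *head; h != NULL; h = next) {
		next = h->next;
		h->fn(h->arg);
	}
}
```

Unlinking before the call is the point of HOOK_REMOVE: if a hook fails and the
list is walked again, the failing hook is already gone.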


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.170 30-Oct-2023 claudio

Adjust KERNEL_ASSERT_UNLOCKED() to not assert during a panic.

KERNEL_ASSERT_UNLOCKED calls _kernel_lock_held() which returns true
if panicstr || db_active which triggers this assert. Workaround this by
checking them before.

This will alter the following Syzkaller reports:
Reported-by: syzbot+169110a0815838ab5940@syzkaller.appspotmail.com
Reported-by: syzbot+3c2eced405b9de6f79c2@syzkaller.appspotmail.com

OK mpi@


# 1.169 17-Oct-2023 cheloha

clockintr: move callback-specific API behaviors to "clockrequest" namespace

The API's behavior when invoked from a callback function is impossible
to document. Move the special behavior into a distinct namespace,
"clockrequest".

- Add a 'struct clockrequest'. Basically a stripped-down 'struct clockintr'
for exclusive use during clockintr_dispatch().
- In clockintr_queue, replace the "cq_shadow" clockintr with a "cq_request"
clockrequest. They serve the same purpose.
- CLST_SHADOW_PENDING -> CR_RESCHEDULE; different namespace, same meaning.
- CLST_IGNORE_SHADOW -> CLST_IGNORE_REQUEST; same meaning.
- Move shadow branch in clockintr_advance() to clockrequest_advance().
- clockintr_request_random() becomes clockrequest_advance_random().
- Delete dead shadow branches in clockintr_cancel(), clockintr_schedule().
- Callback functions now get a clockrequest pointer instead of a special
clockintr pointer: update all prototypes, callers.

No functional change intended.


# 1.168 11-Oct-2023 cheloha

kernel: expand fixed clock interrupt periods to 64-bit values

Technically, all the current fixed clock interrupt periods fit within
an unsigned 32-bit value. But 32-bit multiplication is an accident
waiting to happen. So, expand the fixed periods for hardclock,
statclock, profclock, and roundrobin to 64-bit values.

One exception: statclock_mask remains 32-bit because random(9) yields
32-bit values. Update the initclocks() comment to make it clear that
this is not an accident.
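
The "accident waiting to happen" can be illustrated with a small sketch. The
function names scale32() and scale64() are hypothetical and do not appear in
the kernel; the example assumes a platform where int is 32 bits wide, so that
the product of two uint32_t values is computed in 32 bits.

```c
#include <stdint.h>

/*
 * Both operands are 32-bit, so the product is computed modulo 2^32
 * and only then widened to 64 bits for the return value.
 */
static uint64_t
scale32(uint32_t period_ns, uint32_t count)
{
	return period_ns * count;
}

/*
 * With a 64-bit period the multiplication itself is done in 64 bits,
 * so the same inputs no longer wrap.
 */
static uint64_t
scale64(uint64_t period_ns, uint32_t count)
{
	return period_ns * count;
}
```

With a 10 ms period (10000000 ns) and a count of 1000, scale64() yields
10000000000 while scale32() wraps to 1410065408.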


Revision tags: OPENBSD_7_4_BASE
# 1.167 14-Sep-2023 cheloha

clockintr, statclock: eliminate clockintr_statclock() wrapper

- Move remaining statclock variables from kern_clockintr.c to
kern_clock.c. Move statclock variable initialization from
clockintr_init() into initclocks().

- Change statclock() prototype to make it a legal clockintr
callback function and establish the handle with statclock()
instead of clockintr_statclock().

- Merge the contents of clockintr_statclock() into statclock().
statclock() can now reschedule itself and handles multiple
expirations transparently.

- Make statclock_avg visible from sys/systm.h so that clockintr_cpu_init()
can use it to advance the statclock across suspend/hibernate.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.166 14-Sep-2023 cheloha

clockintr: replace CL_RNDSTAT with global variable statclock_is_randomized

In order to separate the statclock from the clock interrupt subsystem
we need to move all statclock state out into the broader kernel.

Start by replacing the CL_RNDSTAT flag with a new global variable,
"statclock_is_randomized", in kern_clock.c. Update all clockintr_init()
callers to set the boolean instead of passing the flag.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                  userland     kernel
ld.so             readable     unreadable
mmap xz           unreadable   unreadable
mmap x            readable     readable
mmap nrx          readable     readable
mmap nwx          readable     readable
mmap xnwx         readable     readable
main              readable     unreadable
libc unmapped?    readable     unreadable
libc mapped       readable     unreadable

ok kettenis, additional help from miod
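
A rough sketch of the kind of range check this entry describes, written as
standalone C. The names struct text_range and check_not_text() are
hypothetical, as is the choice to let zero-length requests through; the
kernel's actual wrapper around copyin() and copyinstr() is not reproduced
here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one protected text range, [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return EFAULT if the user buffer [uaddr, uaddr + len) overlaps any
 * protected range, else 0.  The comparisons are arranged so that
 * uaddr + len is never computed and cannot overflow.
 */
static int
check_not_text(const struct text_range *r, size_t nranges,
    uintptr_t uaddr, size_t len)
{
	size_t i;

	if (len == 0)
		return 0;
	for (i = 0; i < nranges; i++) {
		if (r[i].start >= r[i].end)
			continue;	/* range not set */
		if (uaddr >= r[i].end)
			continue;	/* buffer starts past the range */
		if (uaddr >= r[i].start || r[i].start - uaddr < len)
			return EFAULT;
	}
	return 0;
}
```

Because the ranges never change once set, a check like this needs no lock,
which is the property the entry relies on to keep it inexpensive.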


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
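
The "first panic wins" behaviour can be sketched in userland C11 as follows.
The names struct cpu, first_panic and record_panic() are hypothetical; only
the 512-byte buffer size comes from the entry. The sketch treats an empty
buffer as "no panic recorded yet", which is an assumption of the example.

```c
#include <stdatomic.h>
#include <stdio.h>

#define PANICBUF_LEN	512

/* Hypothetical stand-in for per-CPU state; only the buffer matters. */
struct cpu {
	char panicbuf[PANICBUF_LEN];
};

/* Points at the message of the first CPU to panic, or NULL. */
static _Atomic(const char *) first_panic;

static void
record_panic(struct cpu *ci, const char *msg)
{
	const char *expected = NULL;

	/* Only the first panic on this CPU is stored in its buffer. */
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Only the first CPU to get here claims the global pointer. */
	atomic_compare_exchange_strong(&first_panic, &expected,
	    ci->panicbuf);
}
```

Later panics on the same CPU leave its buffer untouched, and panics on other
CPUs fill their own buffers without disturbing the global pointer.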


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check that the caller owns the netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
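
For readers without the kernel headers at hand, the clamping idea can be
sketched as standalone C. Everything here is illustrative: struct duration,
duration_to_nsec() and timeout_nsec() are hypothetical names, and the value
given to MAXTSLP (one below INFSLP) is an assumption of this sketch, made
only because the entry says the macro truncates durations below INFSLP.

```c
#include <stdint.h>

#define INFSLP	UINT64_MAX		/* "no timeout" marker */
#define MAXTSLP	(UINT64_MAX - 1)	/* assumed value for this sketch */

/* Hypothetical relative duration, standing in for struct timespec. */
struct duration {
	uint64_t sec;
	uint32_t nsec;			/* 0 .. 999999999 */
};

/* Convert to nanoseconds, saturating at UINT64_MAX on overflow. */
static uint64_t
duration_to_nsec(const struct duration *d)
{
	uint64_t ns;

	if (d->sec > UINT64_MAX / 1000000000ULL)
		return UINT64_MAX;
	ns = d->sec * 1000000000ULL;
	if (ns > UINT64_MAX - d->nsec)
		return UINT64_MAX;
	return ns + d->nsec;
}

/* Clamp so that a finite request never turns into INFSLP. */
static uint64_t
timeout_nsec(const struct duration *d)
{
	uint64_t ns = duration_to_nsec(d);

	return ns > MAXTSLP ? MAXTSLP : ns;
}
```

A saturated conversion would otherwise be indistinguishable from INFSLP, so
the clamp is what keeps "very long" and "forever" apart.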


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64; if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
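
The mapping described in this entry could look roughly like the sketch below. It is not the code from systm.h: the sketch_memcpy name, the sketch_rec type, and the use of __has_builtin are assumptions made for illustration, whereas the commit itself speaks of per-compiler checks for gcc3, gcc4 and clang.

```c
#include <stddef.h>
#include <string.h>

/* Compilers without __has_builtin: treat every builtin as absent. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/*
 * Hypothetical wrapper: use the compiler builtin when it is known to
 * exist, otherwise fall back to the ordinary function.
 */
#if __has_builtin(__builtin_memcpy)
#define sketch_memcpy(dst, src, n) __builtin_memcpy((dst), (src), (n))
#else
#define sketch_memcpy(dst, src, n) memcpy((dst), (src), (n))
#endif

/* Small fixed-size record; a builtin copy of it can be inlined. */
struct sketch_rec {
	int a;
	int b;
};

static void
sketch_copy_rec(struct sketch_rec *dst, const struct sketch_rec *src)
{
	sketch_memcpy(dst, src, sizeof(*dst));
}
```

Either branch of the #if gives the same result; the difference is only whether the compiler is allowed to expand the copy inline.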


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
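
The wrapper arrangement from the last bullet can be sketched as follows. The functions sketch_move() and sketch_movei() are hypothetical stand-ins operating on plain buffers, not the kernel's uiomove()/uiomovei(), and returning EINVAL for a negative length is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the size_t-based routine: copy n bytes, 0 on success. */
static int
sketch_move(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Legacy int-based entry point kept as a thin wrapper.  A negative
 * length is rejected here (assumed behavior) before the cast, since
 * casting it to size_t would yield a huge value.
 */
static int
sketch_movei(void *dst, const void *src, int n)
{
	if (n < 0)
		return EINVAL;
	return sketch_move(dst, src, (size_t)n);
}
```

Keeping the old signature as a wrapper lets callers be reviewed and converted one at a time, after which the wrapper can be deleted.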


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
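
One common way to obtain the "will not be optimized out" guarantee is sketched below. This is not necessarily how the kernel implements explicit_bzero; the sketch_ names are hypothetical, and the volatile function pointer is a technique swapped in for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * The call goes through a volatile function pointer, so the compiler
 * must load the pointer at run time and cannot prove what the call
 * does.  It therefore cannot discard the call as a dead store.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

static void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset(buf, 0, len);
}
```

A typical use is wiping a key or password buffer just before it goes out of scope, which is exactly the case where a plain bzero call is most likely to be removed.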


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
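
The flag semantics listed above can be sketched with a small linked list. All sk_ names are hypothetical and the list layout is an assumption; only the meaning of the two flags is taken from the entry.

```c
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define SK_HOOK_FREE	0x02	/* free the hook after it has run */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Prepend a hook to the list; returns NULL if allocation fails. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honoring the flags. */
static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		*head = h->next;	/* unlink first: never run twice */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}

/* Example hook body: bump the counter passed as argument. */
static void
sk_count(void *arg)
{
	++*(int *)arg;
}
```

Because the hook is unlinked before it is called, a hook that fails and re-enters the shutdown path finds a shorter list each time, which is the property the shutdown case relies on.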


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
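
As a rough illustration of the conversion, the sketch below turns an interval (not an absolute time) into clock ticks. It is a simplified stand-in, not the kernel's tvtohz(): the sketch_tvtohz name, the explicit hz parameter, the round-up policy, and the assumption of a non-negative timeval are all choices made for this example.

```c
#include <stdint.h>
#include <sys/time.h>

/*
 * Convert a non-negative interval to ticks at the given tick rate.
 * Nothing is subtracted from the argument: it is already a duration.
 */
static uint64_t
sketch_tvtohz(const struct timeval *tv, uint64_t hz)
{
	uint64_t ticks = (uint64_t)tv->tv_sec * hz;

	/* Round the sub-second part up so a nonzero interval is never 0. */
	ticks += ((uint64_t)tv->tv_usec * hz + 999999) / 1000000;
	return ticks;
}
```

With hz set to 100, an interval of 2.5 seconds yields 250 ticks and an interval of one microsecond still yields one tick.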


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.169 17-Oct-2023 cheloha

clockintr: move callback-specific API behaviors to "clockrequest" namespace

The API's behavior when invoked from a callback function is impossible
to document. Move the special behavior into a distinct namespace,
"clockrequest".

- Add a 'struct clockrequest'. Basically a stripped-down 'struct clockintr'
for exclusive use during clockintr_dispatch().
- In clockintr_queue, replace the "cq_shadow" clockintr with a "cq_request"
clockrequest. They serve the same purpose.
- CLST_SHADOW_PENDING -> CR_RESCHEDULE; different namespace, same meaning.
- CLST_IGNORE_SHADOW -> CLST_IGNORE_REQUEST; same meaning.
- Move shadow branch in clockintr_advance() to clockrequest_advance().
- clockintr_request_random() becomes clockrequest_advance_random().
- Delete dead shadow branches in clockintr_cancel(), clockintr_schedule().
- Callback functions now get a clockrequest pointer instead of a special
clockintr pointer: update all prototypes, callers.

No functional change intended.


# 1.168 11-Oct-2023 cheloha

kernel: expand fixed clock interrupt periods to 64-bit values

Technically, all the current fixed clock interrupt periods fit within
an unsigned 32-bit value. But 32-bit multiplication is an accident
waiting to happen. So, expand the fixed periods for hardclock,
statclock, profclock, and roundrobin to 64-bit values.

One exception: statclock_mask remains 32-bit because random(9) yields
32-bit values. Update the initclocks() comment to make it clear that
this is not an accident.
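
The hazard this entry guards against can be shown in a few lines. The function names and the sample values are hypothetical; 10000000 ns is simply the period of a 100 Hz clock.

```c
#include <stdint.h>

/* Product computed in 32 bits: silently wraps modulo 2^32. */
static uint64_t
sketch_scale32(uint32_t period_ns, uint32_t count)
{
	uint32_t product = period_ns * count;

	return product;
}

/* Same product with a 64-bit period: no wraparound at this magnitude. */
static uint64_t
sketch_scale64(uint64_t period_ns, uint32_t count)
{
	return period_ns * count;
}
```

Multiplying a 10 ms period by 1000 should give 10000000000 ns. The 32-bit version returns 1410065408 instead, while the 64-bit version returns the expected value.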


Revision tags: OPENBSD_7_4_BASE
# 1.167 14-Sep-2023 cheloha

clockintr, statclock: eliminate clockintr_statclock() wrapper

- Move remaining statclock variables from kern_clockintr.c to
kern_clock.c. Move statclock variable initialization from
clockintr_init() into initclocks().

- Change statclock() prototype to make it a legal clockintr
callback function and establish the handle with statclock()
instead of clockintr_statclock().

- Merge the contents of clockintr_statclock() into statclock().
statclock() can now reschedule itself and handles multiple
expirations transparently.

- Make statclock_avg visible from sys/systm.h so that clockintr_cpu_init()
can use it to advance the statclock across suspend/hibernate.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.166 14-Sep-2023 cheloha

clockintr: replace CL_RNDSTAT with global variable statclock_is_randomized

In order to separate the statclock from the clock interrupt subsystem
we need to move all statclock state out into the broader kernel.

Start by replacing the CL_RNDSTAT flag with a new global variable,
"statclock_is_randomized", in kern_clock.c. Update all clockintr_init()
callers to set the boolean instead of passing the flag.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish(); the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now set up first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                 userland    kernel
ld.so            readable    unreadable
mmap xz          unreadable  unreadable
mmap x           readable    readable
mmap nrx         readable    readable
mmap nwx         readable    readable
mmap xnwx        readable    readable
main             readable    unreadable
libc unmapped?   readable    unreadable
libc mapped      readable    unreadable

ok kettenis, additional help from miod
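
The range check described in this entry amounts to an interval-overlap test against a handful of fixed ranges. The sketch below shows that test only; the sk_ names and the half-open [start, end) convention are assumptions, and the real wrapper sits in front of copyin() and copyinstr().

```c
#include <stddef.h>
#include <stdint.h>

struct sk_range {
	uintptr_t start;	/* first protected address */
	uintptr_t end;		/* one past the last protected address */
};

/* Return 1 if [addr, addr + len) overlaps any protected range, else 0. */
static int
sk_overlaps_text(const struct sk_range *r, size_t nr, uintptr_t addr,
    size_t len)
{
	uintptr_t end = addr + len;
	size_t i;

	if (len == 0)
		return 0;	/* an empty source touches nothing */
	if (end < addr)
		return 1;	/* address wraparound: treat as a violation */
	for (i = 0; i < nr; i++) {
		if (addr < r[i].end && end > r[i].start)
			return 1;
	}
	return 0;
}
```

A caller in the style of the commit would fail the copy with EFAULT when this returns 1. Since the ranges never change once set, the loop needs no locking.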


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint()
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Born from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which is already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to using a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}
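
The clamping idea can also be tried outside the kernel. The following
is a userland sketch under stated assumptions: timespec_to_nsec_sat()
is a hypothetical stand-in for TIMESPEC_TO_NSEC() that saturates at
UINT64_MAX as described above, timeout_nsec() is a hypothetical
helper, and the value of MAXTSLP (one less than INFSLP) is assumed
rather than quoted from this log. It expects a valid, non-negative
timespec.

```c
#include <stdint.h>
#include <time.h>

#define INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest finite sleep */

/*
 * Hypothetical stand-in for TIMESPEC_TO_NSEC(): convert to
 * nanoseconds, returning UINT64_MAX if the result would overflow.
 */
uint64_t
timespec_to_nsec_sat(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/*
 * Turn a relative timespec into a finite timeout: a saturated
 * conversion is truncated to MAXTSLP so it is never mistaken
 * for INFSLP.
 */
uint64_t
timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = timespec_to_nsec_sat(ts);

	return nsecs > MAXTSLP ? MAXTSLP : nsecs;
}
```

With a 64-bit time_t, a timespec of 20000000000 seconds saturates in
the conversion, and the helper returns MAXTSLP instead of INFSLP, so
the caller still gets a (very long) timeout.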

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.
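
The semantics above can be illustrated with a small userland sketch
that maps a nanosecond timeout onto a tick count for a tick-based
interface where 0 means "no timeout". It is a simplified assumption,
not the kernel's actual conversion: nsec_to_timeout_ticks() and the
tick_nsec parameter are hypothetical, and the rounding shown is just
one plausible choice.

```c
#include <limits.h>
#include <stdint.h>

#define INFSLP	UINT64_MAX	/* "do not set a timeout" */

/*
 * Hypothetical helper: convert nanoseconds to ticks.
 * Returns 0 for INFSLP (no timeout in the tick-based interface),
 * at least 1 otherwise, so a zero-nanosecond request still sleeps
 * until the next tick.
 */
int
nsec_to_timeout_ticks(uint64_t nsecs, uint64_t tick_nsec)
{
	uint64_t ticks;

	if (nsecs == INFSLP)
		return 0;

	/* Round up so the sleep is never shorter than requested. */
	ticks = nsecs / tick_nsec;
	if (nsecs % tick_nsec != 0)
		ticks++;

	if (ticks == 0)
		ticks = 1;
	if (ticks > INT_MAX)
		ticks = INT_MAX;
	return (int)ticks;
}
```

The point of the sketch is that zero and "no timeout" are distinct
inputs: only INFSLP maps to the tick-based "sleep forever" value.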

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64; if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
calls panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation for compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with the other KERNEL_*
and SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
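
The guarantee described above, that the clearing call survives
dead-store elimination, can be sketched in userland C. This shows one
commonly used technique (calling memset through a volatile function
pointer) and is not claimed to be how the kernel's explicit_bzero is
implemented; secure_zero() and memset_ptr are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/*
 * Because the pointer is volatile, the compiler has to load it at
 * run time and cannot assume the callee is memset, so it cannot
 * drop the call even if the buffer is never read afterwards.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical helper: zero a buffer holding secret data. */
void
secure_zero(void *buf, size_t len)
{
	memset_ptr(buf, 0, len);
}
```

A caller would use it where a plain bzero() or memset() on a variable
about to go out of scope might otherwise be optimized away, for
example on a key buffer just before returning.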


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
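
The flag semantics listed above can be sketched with a small userland
hook list. This is an illustration only: struct hook, struct
hook_list, hook_add(), run_hooks(), bump(), and the numeric flag
values are hypothetical and not taken from the kernel source; only
the flag names and their meaning follow this commit message.

```c
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define HOOK_FREE	0x02	/* free each hook after running it */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

struct hook_list {
	struct hook *head;
};

/* Append a hook to the end of the list; returns NULL on failure. */
struct hook *
hook_add(struct hook_list *l, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));
	struct hook **pp;

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = NULL;
	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next)
		continue;
	*pp = h;
	return h;
}

/* Run every hook, honoring HOOK_REMOVE and HOOK_FREE. */
void
run_hooks(struct hook_list *l, int flags)
{
	struct hook *h, *next;

	if (flags & HOOK_REMOVE) {
		while ((h = l->head) != NULL) {
			/* Unlink first: a hook that fails is never rerun. */
			l->head = h->next;
			h->fn(h->arg);
			if (flags & HOOK_FREE)
				free(h);
		}
		return;
	}
	for (h = l->head; h != NULL; h = next) {
		next = h->next;
		h->fn(h->arg);
	}
}

/* Sample hook for experiments: increments the int it is given. */
void
bump(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what prevents the endless loop mentioned
above: if a hook re-enters the shutdown path, the list no longer
contains it.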


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.168 11-Oct-2023 cheloha

kernel: expand fixed clock interrupt periods to 64-bit values

Technically, all the current fixed clock interrupt periods fit within
an unsigned 32-bit value. But 32-bit multiplication is an accident
waiting to happen. So, expand the fixed periods for hardclock,
statclock, profclock, and roundrobin to 64-bit values.

One exception: statclock_mask remains 32-bit because random(9) yields
32-bit values. Update the initclocks() comment to make it clear that
this is not an accident.
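
The hazard mentioned above can be shown with a short userland sketch.
The function names and the sample values are hypothetical; the point
is only that a product of two 32-bit operands wraps modulo 2^32 even
when each operand fits comfortably, while the same product in 64 bits
does not.

```c
#include <stdint.h>

/* 32-bit product: silently wraps when the result exceeds UINT32_MAX. */
uint32_t
total_nsec_32(uint32_t count, uint32_t period_nsec)
{
	return (uint32_t)(count * period_nsec);
}

/* 64-bit product: same operands, no wraparound for these magnitudes. */
uint64_t
total_nsec_64(uint64_t count, uint64_t period_nsec)
{
	return count * period_nsec;
}
```

For example, five periods of 1000000000 nanoseconds come out as
705032704 in the 32-bit version and as 5000000000 in the 64-bit one.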


Revision tags: OPENBSD_7_4_BASE
# 1.167 14-Sep-2023 cheloha

clockintr, statclock: eliminate clockintr_statclock() wrapper

- Move remaining statclock variables from kern_clockintr.c to
kern_clock.c. Move statclock variable initialization from
clockintr_init() into initclocks().

- Change statclock() prototype to make it a legal clockintr
callback function and establish the handle with statclock()
instead of clockintr_statclock().

- Merge the contents of clockintr_statclock() into statclock().
statclock() can now reschedule itself and handles multiple
expirations transparently.

- Make statclock_avg visible from sys/systm.h so that clockintr_cpu_init()
can use it to advance the statclock across suspend/hibernate.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.166 14-Sep-2023 cheloha

clockintr: replace CL_RNDSTAT with global variable statclock_is_randomized

In order to separate the statclock from the clock interrupt subsystem
we need to move all statclock state out into the broader kernel.

Start by replacing the CL_RNDSTAT flag with a new global variable,
"statclock_is_randomized", in kern_clock.c. Update all clockintr_init()
callers to set the boolean instead of passing the flag.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
userland kernel
ld.so readable unreadable
mmap xz unreadable unreadable
mmap x readable readable
mmap nrx readable readable
mmap nwx readable readable
mmap xnwx readable readable
main readable unreadable
libc unmapped? readable unreadable
libc mapped readable unreadable

ok kettenis, additional help from miod
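
A lock-free range check of the kind described above might look roughly like
the following userland sketch. The struct and function names are
hypothetical and this is not the kernel's actual wrapper; it only shows how
a source buffer can be tested against a small, fixed set of immutable text
ranges.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end). */
struct ex_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return 1 if the buffer [uaddr, uaddr + len) overlaps any of the n
 * ranges, 0 otherwise. A buffer whose end would wrap around the address
 * space is treated as overlapping so that the caller rejects it.
 */
int
ex_overlaps_text(const struct ex_range *ranges, size_t n,
    uintptr_t uaddr, size_t len)
{
	uintptr_t uend;
	size_t i;

	if (len == 0)
		return 0;
	if (len > UINTPTR_MAX - uaddr)
		return 1;
	uend = uaddr + len;

	for (i = 0; i < n; i++) {
		if (uaddr < ranges[i].end && ranges[i].start < uend)
			return 1;
	}
	return 0;
}
```

In this sketch, a caller would return EFAULT instead of copying whenever
the function reports an overlap.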


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
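
The "first panic wins" rule for the per-CPU buffer can be sketched in
userland as follows. Only the ci_panicbuf name and the 512-byte size come
from the message above; the struct, the function, and the use of an empty
string to mean "no panic recorded yet" are assumptions made for
illustration.

```c
#include <stdio.h>
#include <string.h>

#define EX_PANICBUF_LEN	512	/* size taken from the commit message */

/* Hypothetical stand-in for the relevant part of struct cpu_info. */
struct ex_cpu_info {
	char ci_panicbuf[EX_PANICBUF_LEN];
};

/*
 * Record msg in the CPU's panic buffer only if the buffer is still
 * empty. Returns 1 if the message was stored, 0 if an earlier panic
 * message was already present and has been left untouched.
 */
int
ex_record_panic(struct ex_cpu_info *ci, const char *msg)
{
	if (ci->ci_panicbuf[0] != '\0')
		return 0;
	snprintf(ci->ci_panicbuf, sizeof(ci->ci_panicbuf), "%s", msg);
	return 1;
}
```

A second call on the same CPU leaves the first message in place, which is
what lets an operator recall the original failure from ddb(4).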


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
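
The clamping idea above can be tried in userland with the sketch below. The
EX_-prefixed macros and the ex_ helpers are hypothetical stand-ins, not the
kernel's INFSLP, MAXTSLP, or TIMESPEC_TO_NSEC(); the value chosen for
EX_MAXTSLP is an assumption, and the input is assumed to be non-negative
with tv_nsec below one billion.

```c
#include <stdint.h>

#define EX_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define EX_MAXTSLP	(UINT64_MAX - 1)	/* assumed longest real timeout */

/* Hypothetical stand-in for struct timespec. */
struct ex_timespec {
	int64_t	tv_sec;
	long	tv_nsec;
};

/*
 * Convert to nanoseconds, saturating at UINT64_MAX when the result
 * would not fit in a uint64_t.
 */
uint64_t
ex_timespec_to_nsec(const struct ex_timespec *ts)
{
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if ((uint64_t)ts->tv_sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + nsec;
}

/*
 * Clamp the converted value so that a caller who asked for a finite
 * timeout never passes the "sleep forever" sentinel by accident.
 */
uint64_t
ex_clamp_timeout(const struct ex_timespec *ts)
{
	uint64_t nsecs = ex_timespec_to_nsec(ts);

	return nsecs < EX_MAXTSLP ? nsecs : EX_MAXTSLP;
}
```

An enormous relative timespec therefore becomes the longest finite sleep
rather than an indefinite one.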


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
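
One common way to keep a compiler from discarding such a clearing call is
to route it through a volatile function pointer. The sketch below shows
that technique in userland C; it is an illustration under that assumption,
not the implementation committed here, and the ex_ names are hypothetical.

```c
#include <string.h>

/*
 * Because the pointer is volatile, the compiler must load it at run
 * time and cannot assume the callee is memset, so it cannot prove the
 * store is dead and remove the call.
 */
static void *(*const volatile ex_memset_ptr)(void *, int, size_t) = memset;

/* Zero len bytes at buf in a way the optimizer should not elide. */
void
ex_explicit_bzero(void *buf, size_t len)
{
	ex_memset_ptr(buf, 0, len);
}
```

A caller would use it just like bzero on a key or password buffer right
before the buffer goes out of scope.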


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
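
The flag semantics above can be sketched in userland roughly as follows. The struct layout, the flag values, and the names hook_add(), run_hooks(), and bump() are assumptions for illustration, not the kernel's dohooks() implementation.

```c
#include <stdlib.h>

/* Hypothetical flag values for this sketch. */
#define HOOK_REMOVE	0x01	/* unlink each hook before executing it */
#define HOOK_FREE	0x02	/* free each hook after it has run */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Push a hook onto the front of the list; returns 0, or -1 on failure. */
static int
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return -1;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return 0;
}

/* Run every hook on the list, honoring HOOK_REMOVE and HOOK_FREE. */
static void
run_hooks(struct hook **head, int flags)
{
	struct hook *h;

	if (flags & HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			/* Unlink first: a hook that fails cannot be re-run. */
			*head = h->next;
			h->fn(h->arg);
			if (flags & HOOK_FREE)
				free(h);
		}
	} else {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
	}
}

/* Example callback: increments the int that arg points to. */
static void
bump(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what keeps a panicking shutdown hook from being entered again on the next pass over the list.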


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.167 14-Sep-2023 cheloha

clockintr, statclock: eliminate clockintr_statclock() wrapper

- Move remaining statclock variables from kern_clockintr.c to
kern_clock.c. Move statclock variable initialization from
clockintr_init() into initclocks().

- Change statclock() prototype to make it a legal clockintr
callback function and establish the handle with statclock()
instead of clockintr_statclock().

- Merge the contents of clockintr_statclock() into statclock().
statclock() can now reschedule itself and handles multiple
expirations transparently.

- Make statclock_avg visible from sys/systm.h so that clockintr_cpu_init()
can use it to advance the statclock across suspend/hibernate.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.166 14-Sep-2023 cheloha

clockintr: replace CL_RNDSTAT with global variable statclock_is_randomized

In order to separate the statclock from the clock interrupt subsystem
we need to move all statclock state out into the broader kernel.

Start by replacing the CL_RNDSTAT flag with a new global variable,
"statclock_is_randomized", in kern_clock.c. Update all clockintr_init()
callers to set the boolean instead of passing the flag.

Thread: https://marc.info/?l=openbsd-tech&m=169428749720476&w=2


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2
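
The call order required of initclocks() can be sketched with stubs. Every name ending in _stub or _sketch, and the call_order log, is a placeholder invented for this illustration. The real routines are machine-dependent.

```c
#include <string.h>

/* Records the order in which the stub routines run. */
static char call_order[64];

static void
note(const char *name)
{
	strncat(call_order, name,
	    sizeof(call_order) - strlen(call_order) - 1);
}

/* Would initialize hz, stathz, and profhz and call clockintr_init(). */
static void
cpu_initclocks_stub(void)
{
	note("init ");
}

/* Stands in for the machine-independent setup between the two calls. */
static void
mi_setup_stub(void)
{
	note("mi ");
}

/* Would call clockintr_cpu_init() and start the interrupt cycle. */
static void
cpu_startclock_stub(void)
{
	note("start ");
}

/* The ordering the commit message describes for initclocks(). */
static void
initclocks_sketch(void)
{
	cpu_initclocks_stub();
	mi_setup_stub();
	cpu_startclock_stub();
}
```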


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish(); the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                    userland        kernel
ld.so               readable        unreadable
mmap xz             unreadable      unreadable
mmap x              readable        readable
mmap nrx            readable        readable
mmap nwx            readable        readable
mmap xnwx           readable        readable
main                readable        unreadable
libc unmapped?      readable        unreadable
libc mapped         readable        unreadable

ok kettenis, additional help from miod
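
A rough sketch of the kind of range check such a wrapper could perform before calling copyin(). The struct text_range type and the overlaps_text() helper are hypothetical names for this illustration. The real kernel code and its data structures may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One protected text mapping, covering the half-open range [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return true if the userland source [uaddr, uaddr + len) overlaps
 * any protected range.  A caller would fail with EFAULT in that
 * case instead of performing the copy.
 */
static bool
overlaps_text(const struct text_range *r, size_t nranges,
    uintptr_t uaddr, size_t len)
{
	uintptr_t uend;
	size_t i;

	if (len == 0)
		return false;
	uend = uaddr + len;
	if (uend < uaddr)	/* address wraparound: reject */
		return true;
	for (i = 0; i < nranges; i++) {
		if (uaddr < r[i].end && r[i].start < uend)
			return true;
	}
	return false;
}
```

Because the message says the ranges are immutable once set, a loop like this needs no locking, which is why the check stays cheap.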


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint()
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
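
The "first panic wins" behavior can be sketched with C11 atomics. This is a userland approximation: NCPU, the cpus[] array, and record_panic() are assumptions for the example, and only ci_panicbuf and panicstr are names taken from the message.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define PANICBUF_LEN	512
#define NCPU		4	/* arbitrary CPU count for the sketch */

struct cpu_info {
	char ci_panicbuf[PANICBUF_LEN];
};

static struct cpu_info cpus[NCPU];
static _Atomic(const char *) panicstr;

/*
 * Record a panic message for the given CPU.  Returns 1 if this call
 * was the first panic system-wide, 0 otherwise.
 */
static int
record_panic(int cpu, const char *msg)
{
	struct cpu_info *ci = &cpus[cpu];
	const char *expected = NULL;

	/* Only the first panic on this CPU writes its buffer. */
	if (ci->ci_panicbuf[0] == '\0')
		snprintf(ci->ci_panicbuf, sizeof(ci->ci_panicbuf), "%s", msg);

	/* Exactly one CPU wins the race to set panicstr. */
	return atomic_compare_exchange_strong(&panicstr, &expected,
	    ci->ci_panicbuf);
}
```

The sketch treats an empty first byte as "buffer unused", so a later panic on the same CPU leaves the original message intact.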


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
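
A self-contained userland sketch of the clamping pattern from this message. The value MAXTSLP = UINT64_MAX - 1 is an assumption here, and timespec_to_nsec() and timeout_nsec() are hypothetical stand-ins for TIMESPEC_TO_NSEC() and the caller's logic. The sketch also assumes a non-negative timespec and a 64-bit time_t.

```c
#include <stdint.h>
#include <time.h>

#define INFSLP	UINT64_MAX		/* "no timeout", per the message */
#define MAXTSLP	(UINT64_MAX - 1)	/* assumed value for this sketch */

/* Saturating conversion: returns UINT64_MAX if the result would overflow. */
static uint64_t
timespec_to_nsec(const struct timespec *ts)
{
	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Clamp to MAXTSLP so a huge timeout never turns into INFSLP. */
static uint64_t
timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = timespec_to_nsec(ts);

	return nsecs > MAXTSLP ? MAXTSLP : nsecs;
}
```

Without the clamp, a saturated conversion would equal INFSLP and silently become a sleep with no timeout.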


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
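
A minimal sketch of mapping a function to a compiler builtin behind a compiler check. The macro name k_memcpy and the exact condition are assumptions for illustration. The actual header maps the standard names and uses its own per-compiler checks.

```c
#include <string.h>

/* Older compilers lack __has_builtin; treat it as "unknown". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/*
 * Use the builtin where the compiler is known to provide it,
 * otherwise fall back to the ordinary library function.
 */
#if defined(__GNUC__) || __has_builtin(__builtin_memcpy)
#define k_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#else
#define k_memcpy(d, s, n)	memcpy((d), (s), (n))
#endif
```

With -ffreestanding the compiler no longer assumes that memcpy() has its standard meaning, so an explicit mapping like this is what keeps the inlined fast path for small copies.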


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
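
As an illustration of the problem being solved, the following userland sketch shows one common way to keep a compiler from discarding a "dead" zeroing store: calling memset() through a volatile function pointer. wipe_secret() is a hypothetical name, and this is not claimed to be how the kernel's explicit_bzero is implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler must reload it and cannot
 * assume the callee is memset(); the call therefore cannot be removed
 * as a dead store. Assumed technique, for illustration only.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

void
wipe_secret(void *buf, size_t len)
{
	memset_ptr(buf, 0, len);
}
```

A plain memset() on a buffer that is never read again may legally be optimized away, which is the case this kind of wrapper is meant to avoid.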


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
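
The flag semantics described above can be sketched with a small singly linked list. Everything here is hypothetical: the flag values, struct hook, hook_add(), and run_hooks() are illustrative names, and the real hooks code is not reproduced. The point is the ordering: with HOOK_REMOVE the hook is unlinked before it runs, so a hook that fails cannot be reached again on a second pass.

```c
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define HOOK_FREE	0x02	/* free the hook after running it */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Add a hook at the head of the list; returns NULL on allocation failure. */
struct hook *
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honoring HOOK_REMOVE and HOOK_FREE. */
void
run_hooks(struct hook **head, int flags)
{
	struct hook *h, *next;

	for (h = *head; h != NULL; h = next) {
		next = h->next;
		if (flags & HOOK_REMOVE)
			*head = next;	/* unlink first, then execute */
		h->fn(h->arg);
		if ((flags & HOOK_REMOVE) && (flags & HOOK_FREE))
			free(h);
	}
}

/* Example hook body: bump the counter passed as argument. */
void
count_hook(void *arg)
{
	(*(int *)arg)++;
}
```

In this sketch HOOK_FREE only takes effect together with HOOK_REMOVE, matching the two combinations the commit message allows.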


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.165 23-Aug-2023 cheloha

all platforms: separate cpu_initclocks() from cpu_startclock()

To give the primary CPU an opportunity to perform clock interrupt
preparation in a machine-independent manner we need to separate the
"initialization" parts of cpu_initclocks() from the "start the clock
interrupt" parts. Currently, cpu_initclocks() does everything all at
once, so there is no space for this MI setup.

Many platforms have more-or-less already done this separation by
implementing a separate routine named "cpu_startclock()". This patch
promotes cpu_startclock() from de facto standard to mandatory API.

- Prototype cpu_startclock() in sys/systm.h alongside cpu_initclocks().
The separation of responsibility between the two routines is a bit
fuzzy but the basic guidelines are as follows:

+ cpu_initclocks() must initialize hz, stathz, and profhz, and call
clockintr_init().

+ cpu_startclock() must call clockintr_cpu_init() and start the clock
interrupt cycle on the calling CPU.

These guidelines will shift in the future, but that's the way things
stand as of *this* commit.

- In initclocks(): first call cpu_initclocks(), then do MI setup, and
last call cpu_startclock().

- On platforms where cpu_startclock() already exists: don't call
cpu_startclock() from cpu_initclocks() anymore.

- On platforms where cpu_startclock() doesn't yet exist: implement it.
Usually this is as simple as dividing cpu_initclocks() in two.

Tested on amd64 (i8254, lapic), arm64, i386 (i8254, lapic), macppc,
mips64/octeon, and sparc64. Tested on arm/armv7 (agtimer(4)) by
phessler@ and jmatthew@. Tested on m88k/luna88k by aoyama@. Tested
on powerpc64 by gkoehler@ and mlarkin@. Tested on riscv64 by
jmatthew@.

Thread: https://marc.info/?l=openbsd-tech&m=169195251322149&w=2
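
The call order described for initclocks() can be sketched with stubs that record when they run. All names ending in _stub or _sketch are hypothetical placeholders; the bodies of the real routines are machine-dependent and are not shown.

```c
#include <stddef.h>

/* Records the order of calls so the sequence can be inspected. */
char call_log[8];
size_t call_count;

static void
log_call(char tag)
{
	if (call_count < sizeof(call_log) - 1)
		call_log[call_count++] = tag;
}

/* Stubs standing in for the MD and MI routines. */
void cpu_initclocks_stub(void) { log_call('I'); }	/* set hz, stathz, profhz */
void mi_clock_setup_stub(void) { log_call('M'); }	/* MI preparation */
void cpu_startclock_stub(void) { log_call('S'); }	/* start clock interrupts */

void
initclocks_sketch(void)
{
	cpu_initclocks_stub();	/* 1. MD initialization only */
	mi_clock_setup_stub();	/* 2. MI setup now has room to run */
	cpu_startclock_stub();	/* 3. finally start the interrupt cycle */
}
```

The split matters because step 2 has nowhere to go when a single routine both initializes and starts the clock.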


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                    userland    kernel
ld.so               readable    unreadable
mmap xz             unreadable  unreadable
mmap x              readable    readable
mmap nrx            readable    readable
mmap nwx            readable    readable
mmap xnwx           readable    readable
main                readable    unreadable
libc unmapped?      readable    unreadable
libc mapped         readable    unreadable

ok kettenis, additional help from miod
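
The range check described above amounts to an interval-overlap test between the userland source buffer and a handful of immutable text ranges. A minimal userland sketch follows; struct addr_range and src_overlaps_protected() are hypothetical names, and the real wrapper in front of copyin()/copyinstr() is not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A half-open address range [start, end). */
struct addr_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return true if the source buffer [src, src + len) overlaps any of the
 * protected ranges. A caller would fail with EFAULT in that case.
 */
bool
src_overlaps_protected(uintptr_t src, size_t len,
    const struct addr_range *ranges, size_t nranges)
{
	uintptr_t end;
	size_t i;

	if (len == 0)
		return false;
	if (src > UINTPTR_MAX - len)
		return true;		/* wrap-around: treat as a violation */
	end = src + len;
	for (i = 0; i < nranges; i++) {
		if (src < ranges[i].end && ranges[i].start < end)
			return true;
	}
	return false;
}
```

Because the ranges never change after they are set, a check of this shape needs no locking, which is consistent with the cost argument in the commit message.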


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Born from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
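
The two rules above (the first panic on a CPU owns that CPU's buffer, and the first CPU to panic is identified atomically) can be sketched with C11 atomics. This is a userland illustration under stated assumptions: struct cpu_state, cpus[], first_panic, record_panic(), and NCPU are hypothetical names, and only the 512-byte size comes from the commit message.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define NCPU		4	/* hypothetical CPU count for the sketch */
#define PANICBUF_LEN	512	/* buffer size named in the commit message */

struct cpu_state {
	char panicbuf[PANICBUF_LEN];	/* first panic message on this CPU */
};

struct cpu_state cpus[NCPU];
_Atomic(const char *) first_panic;	/* set once, by the first CPU to panic */

/*
 * Record a panic on the given CPU. Returns true if this call was the
 * first panic system-wide.
 */
bool
record_panic(int cpu, const char *msg)
{
	const char *expected = NULL;
	struct cpu_state *cs = &cpus[cpu];

	/* Only the first panic on this CPU writes the buffer. */
	if (cs->panicbuf[0] == '\0')
		snprintf(cs->panicbuf, sizeof(cs->panicbuf), "%s", msg);

	/* Atomically claim "first panic"; later callers fail the exchange. */
	return atomic_compare_exchange_strong(&first_panic, &expected,
	    cs->panicbuf);
}
```

Since each CPU writes only its own buffer, simultaneous panics on different CPUs cannot interleave their messages.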


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
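
The clamping idea can be tried outside the kernel with a saturating conversion. In this sketch the SKETCH_ macros and both function names are hypothetical. INFSLP being UINT64_MAX is stated in the log; the value UINT64_MAX - 1 for the maximum real timeout is an assumption made for illustration.

```c
#include <stdint.h>
#include <time.h>

#define SKETCH_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest real timeout */

/* Convert a non-negative timespec to nanoseconds, saturating at UINT64_MAX. */
uint64_t
timespec_to_nsec_sat(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a huge but finite timeout never turns into "sleep forever". */
uint64_t
timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = timespec_to_nsec_sat(ts);

	return nsecs > SKETCH_MAXTSLP ? SKETCH_MAXTSLP : nsecs;
}
```

Without the clamp, a saturated conversion would equal the sentinel and silently request an unbounded sleep.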


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Moreover, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
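
Editor's illustration (not part of the commit): one common userland way to
get the "compiler will not optimize this out" property described above is
to call memset() through a volatile function pointer. This is only a sketch
of that general technique; the names sketch_memset_ptr and
sketch_explicit_bzero are hypothetical, and this is not claimed to be how
the kernel's explicit_bzero() is implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler has to load it at run time
 * and cannot prove that the call is a plain memset() whose stores are
 * dead. That keeps the zeroing from being eliminated.
 */
static void *(*const volatile sketch_memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical helper: zero len bytes at buf, even if buf is never read again. */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset_ptr(buf, 0, len);
}
```

A caller would use it on a key or password buffer just before the buffer
goes out of scope, where a normal bzero() might be dropped as a dead store.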


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
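
Editor's illustration (not part of the commit): the flag semantics listed
above can be sketched in userland C as follows. All names carry a sketch_
prefix because they are hypothetical; the real kernel structures, list
macros and flag values are not shown in this log and may differ.

```c
#include <stdlib.h>

#define SKETCH_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SKETCH_HOOK_FREE	0x02	/* free each hook after running it */

struct sketch_hook {
	struct sketch_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook body: bump the int counter passed as the argument. */
void
sketch_count_hook(void *arg)
{
	++*(int *)arg;
}

/* Register a hook at the head of the list; returns NULL on allocation failure. */
struct sketch_hook *
sketch_hook_establish(struct sketch_hook **head, void (*fn)(void *), void *arg)
{
	struct sketch_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook, honouring the REMOVE and FREE flags. */
void
sketch_dohooks(struct sketch_hook **head, int flags)
{
	struct sketch_hook *h;

	if ((flags & SKETCH_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first: if fn() fails badly, a rerun will not loop on it. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SKETCH_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is what gives the "don't loop for ever" property
the message asks for on the shutdown path.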


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.164 05-Aug-2023 cheloha

hardclock(9): move setitimer(2) code into itimer_update()

- Move the setitimer(2) code responsible for updating the ITIMER_VIRTUAL
and ITIMER_PROF timers from hardclock(9) into a new clock interrupt
routine, itimer_update(). itimer_update() is periodic and runs at the
same frequency as the hardclock.

+ Revise itimerdecr() to run within itimer_mtx instead of entering
and leaving it.

- Each schedstate_percpu has its own itimer_update() handle, spc_itimer.
A new scheduler flag, SPCF_ITIMER, indicates whether spc_itimer was
started during the last mi_switch() and needs to be stopped during the
next mi_switch() or sched_exit().

- A new per-process flag, PS_ITIMER, indicates whether ITIMER_VIRTUAL
and/or ITIMER_PROF are running. Checking the flag is easier than
entering itimer_mtx to check process.ps_timer[]. The flag is set
and cleared in a new helper function, process_reset_itimer_flag().

- In setitimer(), call need_resched() when the state of ITIMER_VIRTUAL
or ITIMER_PROF is changed to force an mi_switch() and update
spc_itimer.

claudio@ notes that ITIMER_PROF could be implemented as a high-res
timer using the thread's execution time as a guide for when to
interrupt the process and assert SIGPROF. This would probably work
really well in single-threaded processes. ITIMER_VIRTUAL would be
more difficult to make high-res, though, as you need to exclude time
spent in the kernel.

Tested on powerpc64 by gkoehler@. With input from claudio@.

Thread: https://marc.info/?l=openbsd-tech&m=169038818517101&w=2

ok claudio@


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                  userland      kernel
ld.so             readable      unreadable
mmap xz           unreadable    unreadable
mmap x            readable      readable
mmap nrx          readable      readable
mmap nwx          readable      readable
mmap xnwx         readable      readable
main              readable      unreadable
libc unmapped?    readable      unreadable
libc mapped       readable      unreadable

ok kettenis, additional help from miod
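
Editor's illustration (not part of the commit): the range-checking idea
described above, i.e. refusing a copy whose userland source overlaps one of
a few immutable text ranges, can be sketched in plain C. The struct and
function names are hypothetical, and treating an address wrap as a failure
is an assumption of this sketch rather than something the log states.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct sketch_range {
	uintptr_t start;
	uintptr_t end;
};

/* Return true if [addr, addr + len) overlaps any of the n ranges. */
bool
sketch_overlaps_text(const struct sketch_range *r, size_t n,
    uintptr_t addr, size_t len)
{
	uintptr_t end;
	size_t i;

	if (len == 0)
		return false;
	end = addr + len;
	if (end < addr)
		return true;	/* wrapped around: reject conservatively */
	for (i = 0; i < n; i++) {
		if (addr < r[i].end && r[i].start < end)
			return true;
	}
	return false;
}

/* Wrapper in the spirit of the message: EFAULT on overlap, 0 otherwise. */
int
sketch_copyin_check(const struct sketch_range *r, size_t n,
    uintptr_t addr, size_t len)
{
	return sketch_overlaps_text(r, n, addr, len) ? EFAULT : 0;
}
```

Because the ranges never change once set, a check like this needs no lock,
which matches the "inexpensive" claim in the message.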


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint()
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
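
Editor's illustration (not part of the commit): the two rules above (the
first panic on a CPU fills that CPU's buffer, and panicstr is set atomically
so exactly one CPU is identified as first) can be sketched with C11 atomics.
Every name here is hypothetical, the CPU count is arbitrary, and the real
panic(9) path does far more than this.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_PANICBUF_LEN	512	/* buffer size named in the message */
#define SKETCH_NCPUS		4	/* arbitrary for the sketch */

struct sketch_cpu {
	char panicbuf[SKETCH_PANICBUF_LEN];
};

static struct sketch_cpu sketch_cpus[SKETCH_NCPUS];
static _Atomic(const char *) sketch_panicstr;

/*
 * Record a panic on the given CPU. Returns 1 if this call was the first
 * panic system-wide, 0 otherwise.
 */
int
sketch_panic(int cpu, const char *msg)
{
	struct sketch_cpu *ci = &sketch_cpus[cpu];
	const char *expected = NULL;

	/* Only the first panic on this CPU may write its buffer. */
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Compare-and-swap: succeeds for exactly one caller. */
	return atomic_compare_exchange_strong(&sketch_panicstr, &expected,
	    ci->panicbuf);
}
```

A "show panic" style command could then walk sketch_cpus[] and mark the
entry whose buffer equals sketch_panicstr with an asterisk.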


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}
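
As a userland illustration of the clamping described above, here is a
minimal sketch. It assumes MAXTSLP is one less than INFSLP (the message
only states that INFSLP is UINT64_MAX), and ts_to_nsec_sat() is a
hypothetical stand-in for TIMESPEC_TO_NSEC(), not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Assumed values for this sketch; only INFSLP == UINT64_MAX is from the log. */
#define SKETCH_INFSLP	UINT64_MAX
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)

/*
 * Hypothetical stand-in for TIMESPEC_TO_NSEC(): convert a non-negative,
 * normalized timespec to nanoseconds, saturating at UINT64_MAX on overflow.
 */
static uint64_t
ts_to_nsec_sat(const struct timespec *ts)
{
	uint64_t sec, nsec;

	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 &&
	    ts->tv_nsec < 1000000000L);
	sec = (uint64_t)ts->tv_sec;
	nsec = (uint64_t)ts->tv_nsec;

	/* sec * 1e9 + nsec fits in 64 bits only if sec is small enough. */
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so a finite timeout can never be mistaken for "no timeout". */
static uint64_t
timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = ts_to_nsec_sat(ts);

	return nsecs > SKETCH_MAXTSLP ? SKETCH_MAXTSLP : nsecs;
}
```

With this shape, a huge relative timespec saturates in the conversion and
is then pulled back below the "sleep forever" value by the clamp.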

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
calls panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
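
One common way to get the "will not be optimized out" property is to call
memset() through a volatile function pointer. The sketch below shows that
technique only; sketch_explicit_bzero() and memset_fn are hypothetical
names, and this is not claimed to be how the kernel implements
explicit_bzero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler has to load it at run time and
 * cannot assume the callee is the memset() it knows how to elide.
 */
static void *(*const volatile memset_fn)(void *, int, size_t) = memset;

/* Zero len bytes at buf, even if buf is never read again afterwards. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		memset_fn(buf, 0, len);
}
```

As the message says, any such trick may need adjusting when compilers and
linkers get smarter.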


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
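
The flag semantics above can be sketched in userland roughly as follows.
All sk_* names are hypothetical, the list is a plain singly linked list,
and the kernel's real data structures and allocator are not shown in this
log, so treat this as an illustration rather than the committed code.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after running it */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook body: bump the counter passed as argument. */
static void
sk_count(void *arg)
{
	++*(int *)arg;
}

/* Append a hook to the list; returns NULL if allocation fails. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h, **pp;

	if ((h = malloc(sizeof(*h))) == NULL)
		return NULL;
	h->next = NULL;
	h->fn = fn;
	h->arg = arg;
	for (pp = head; *pp != NULL; pp = &(*pp)->next)
		continue;
	*pp = h;
	return h;
}

static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	/* Freeing a hook that is still linked would corrupt the list. */
	assert((flags & SK_HOOK_FREE) == 0 || (flags & SK_HOOK_REMOVE) != 0);

	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		*head = h->next;	/* unlink first: a failing hook is not retried */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is what keeps a hook that fails from being run
again on a second pass over the list.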


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.163 14-Jul-2023 claudio

struct sleep_state is no longer used, remove it.
Also remove the priority argument to sleep_finish() the code can use
the p_flag P_SINTR flag to know if the signal check is needed or not.
OK cheloha@ kettenis@ mpi@


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.
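
The range check can be pictured with the following userland sketch. The
names (sketch_range, sketch_xonly_check) and the half-open range
representation are assumptions made for illustration; the log does not
show the kernel's actual wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* One protected text range, half-open: [start, end). */
struct sketch_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return EFAULT if the source [uaddr, uaddr + len) overlaps any protected
 * range, 0 otherwise. The ranges are read-only here, so no lock is taken.
 */
static int
sketch_xonly_check(const struct sketch_range *r, size_t nranges,
    uintptr_t uaddr, size_t len)
{
	uintptr_t uend;
	size_t i;

	assert(r != NULL || nranges == 0);
	if (len == 0)
		return 0;
	if (uaddr > UINTPTR_MAX - len)
		return EFAULT;		/* source wraps the address space */
	uend = uaddr + len;

	for (i = 0; i < nranges; i++) {
		if (r[i].start >= r[i].end)
			continue;	/* unused slot */
		if (uaddr < r[i].end && r[i].start < uend)
			return EFAULT;
	}
	return 0;
}
```

A real copyin() wrapper would run a check like this first and only then
perform the copy; with at most four ranges the loop stays cheap.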

On a machine without mmu enforcement, a test program reports the following:
                  userland     kernel
ld.so             readable     unreadable
mmap xz           unreadable   unreadable
mmap x            readable     readable
mmap nrx          readable     readable
mmap nwx          readable     readable
mmap xnwx         readable     readable
main              readable     unreadable
libc unmapped?    readable     unreadable
libc mapped       readable     unreadable

ok kettenis, additional help from miod


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.
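
A minimal sketch of the "first panic wins" idea, assuming C11 atomics in place
of the kernel's own atomic primitives. struct cpu_sketch, panicstr_sketch and
record_panic_sketch() are hypothetical names used only for illustration; the
real code lives in panic(9) and struct cpu_info.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PANICBUF_LEN	512	/* buffer size named in the commit message */

/* Hypothetical stand-in for the per-CPU structure. */
struct cpu_sketch {
	char panicbuf[PANICBUF_LEN];
};

/* Hypothetical stand-in for the global panicstr pointer. */
static _Atomic(const char *) panicstr_sketch;

/*
 * Record a panic message for this CPU.  Only the first panic on a CPU
 * writes its buffer; later panics leave it untouched.  Returns 1 if
 * this call was the first panic system-wide, 0 otherwise.
 */
static int
record_panic_sketch(struct cpu_sketch *ci, const char *msg)
{
	const char *expected = NULL;

	assert(ci != NULL && msg != NULL);

	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Atomically claim panicstr if no other CPU has done so yet. */
	return atomic_compare_exchange_strong(&panicstr_sketch, &expected,
	    (const char *)ci->panicbuf);
}
```

The compare-and-swap is what identifies the first CPU to panic, which a
debugger command could then mark with an asterisk.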

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Born from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

    case SIOCTIMEOUT:
    {
        struct timeval *tv = (struct timeval *)data;
        uint64_t nsecs;

        if (tv->tv_sec < 0 || !timerisvalid(tv))
            return EINVAL;

        nsecs = TIMEVAL_TO_NSEC(tv);
        if (nsecs > MAXTSLP)
            return EOVERFLOW;

        obj.timeout = nsecs;
        break;
    }
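
The overflow-then-clamp behavior described above can be tried outside the
kernel with a sketch like the following. The names carrying a _sketch or
_SKETCH suffix are hypothetical stand-ins, and the value chosen for
MAXTSLP_SKETCH (one below UINT64_MAX) is an assumption based on the
description "below INFSLP", not a quotation of the header.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define INFSLP_SKETCH	UINT64_MAX		/* "no timeout" marker */
#define MAXTSLP_SKETCH	(UINT64_MAX - 1)	/* assumed maximum duration */

/*
 * Convert a non-negative timespec to nanoseconds, returning
 * UINT64_MAX if the result would not fit in a uint64_t.
 */
static uint64_t
timespec_to_nsec_sketch(const struct timespec *ts)
{
	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Clamp so that a huge timeout never turns into "sleep forever". */
static uint64_t
timeout_nsec_sketch(const struct timespec *ts)
{
	uint64_t nsecs = timespec_to_nsec_sketch(ts);

	return nsecs < MAXTSLP_SKETCH ? nsecs : MAXTSLP_SKETCH;
}
```

Without the clamp, an overflowing conversion would yield UINT64_MAX, which is
indistinguishable from INFSLP and would silently drop the caller's timeout.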

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
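
For illustration only, one common portable way to keep a compiler from
dropping a "dead" zeroing call is to route it through a volatile function
pointer. This is a swapped-in technique, not the implementation this commit
added, and explicit_bzero_sketch() and memset_ptr are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler must load it at run time
 * and cannot assume the callee is memset() and elide the call.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Zero len bytes at buf, even if buf is never read again. */
static void
explicit_bzero_sketch(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	(void)memset_ptr(buf, 0, len);
}
```

As the commit message notes, any such approach may need adjusting as
compilers and linkers get smarter.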


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
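
The flag semantics above can be sketched with a small singly linked list.
This is a simplified illustration, not the kernel's hook code: struct hook,
hook_add_sketch() and dohooks_sketch() are hypothetical, and the numeric flag
values are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* assumed value: unlink before running */
#define HOOK_FREE	0x02	/* assumed value: free after running */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Push a new hook onto the list; returns NULL on allocation failure. */
static struct hook *
hook_add_sketch(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

static void
dohooks_sketch(struct hook **head, int flags)
{
	struct hook *h, *next;

	assert(head != NULL);

	if ((flags & HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = next) {
			next = h->next;
			h->fn(h->arg);
		}
		return;
	}
	while ((h = *head) != NULL) {
		/*
		 * Unlink first: if the hook panics and we re-enter,
		 * it is no longer on the list, so we cannot loop.
		 */
		*head = h->next;
		h->fn(h->arg);
		if (flags & HOOK_FREE)
			free(h);
	}
}

/* Example hook: bump the counter that arg points to. */
static void
count_hook_sketch(void *arg)
{
	++*(int *)arg;
}
```

Running with HOOK_REMOVE alone leaves the unlinked entries allocated, which
mirrors the choice made for shutdown hooks; adding HOOK_FREE reclaims them,
as for startup hooks.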


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare them accordingly; also remove private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.162 28-Jun-2023 claudio

First step at removing struct sleep_state.

Pass the timeout and sleep priority not only to sleep_setup() but also
to sleep_finish(). With that sls_timeout and sls_catch can be removed
from struct sleep_state.

The timeout is now setup first thing in sleep_finish() and no longer as
last thing in sleep_setup(). This should not cause a noticeable difference
since the code run between sleep_setup() and sleep_finish() is minimal.

OK kettenis@


Revision tags: OPENBSD_7_3_BASE
# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                userland        kernel
ld.so           readable        unreadable
mmap xz         unreadable      unreadable
mmap x          readable        readable
mmap nrx        readable        readable
mmap nwx        readable        readable
mmap xnwx       readable        readable
main            readable        unreadable
libc unmapped?  readable        unreadable
libc mapped     readable        unreadable

ok kettenis, additional help from miod
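
The range check described above can be sketched in userland C. This is a minimal illustration only, not the kernel's code: struct text_range and overlaps_text() are hypothetical names, and the sketch assumes each protected text region is stored as a half-open [start, end) interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one immutable text range. */
struct text_range {
	uintptr_t start;	/* inclusive */
	uintptr_t end;		/* exclusive */
};

/*
 * Return true if [addr, addr + len) overlaps any of the n ranges.
 * An empty request (len == 0) never overlaps.
 */
bool
overlaps_text(const struct text_range *r, size_t n, uintptr_t addr, size_t len)
{
	uintptr_t last;
	size_t i;

	assert(r != NULL || n == 0);
	if (len == 0)
		return false;

	/* Last byte of the request; clamp if the addition wraps. */
	last = addr + (len - 1);
	if (last < addr)
		last = UINTPTR_MAX;

	for (i = 0; i < n; i++) {
		if (r[i].start >= r[i].end)
			continue;	/* unused slot */
		if (addr < r[i].end && last >= r[i].start)
			return true;
	}
	return false;
}
```

A wrapper in front of a copy routine would return EFAULT when overlaps_text() is true. Because the ranges never change once set, the loop needs no lock, which is the property the commit message relies on.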


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
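
The "first panic wins" behavior can be sketched with C11 atomics. This is an illustration under stated assumptions, not the kernel implementation: struct cpu_sketch, record_panic(), cpu_panic_message() and NCPU_SKETCH are hypothetical names; only the 512-byte buffer size comes from the message above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define PANICBUF_LEN	512	/* buffer size named in the commit */
#define NCPU_SKETCH	4	/* hypothetical CPU count for the sketch */

/* Hypothetical stand-in for the per-CPU part of struct cpu_info. */
struct cpu_sketch {
	char panicbuf[PANICBUF_LEN];
};

static struct cpu_sketch cpus[NCPU_SKETCH];
static _Atomic(const char *) first_panicstr;	/* NULL until first panic */

/*
 * Record a panic message for the given CPU.  Returns true if this call
 * was the first panic system-wide.
 */
bool
record_panic(int cpu, const char *fmt, ...)
{
	const char *expected = NULL;
	const char *desired;
	char *buf;
	va_list ap;

	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	buf = cpus[cpu].panicbuf;

	/* Only the first panic on this CPU may write its buffer. */
	if (buf[0] == '\0') {
		va_start(ap, fmt);
		vsnprintf(buf, PANICBUF_LEN, fmt, ap);
		va_end(ap);
	}

	/* Exactly one caller wins the race to publish the global pointer. */
	desired = buf;
	return atomic_compare_exchange_strong(&first_panicstr, &expected,
	    desired);
}

/* Return the message recorded for a CPU ("" if it never panicked). */
const char *
cpu_panic_message(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU_SKETCH);
	return cpus[cpu].panicbuf;
}
```

The compare-and-swap is what lets a "show panic" style command mark one message as the first: later callers see a non-NULL pointer and leave it alone.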


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only the ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
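
For readers outside the kernel, the clamping idea can be tried in userland. The sketch below is illustrative only: sk_timespec_to_nsec() and sk_sleep_nsec() are hypothetical stand-ins for the kernel conversion helper and the MIN() idiom, and the SK_ constants assume the sentinel is UINT64_MAX with the longest finite sleep one below it.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed longest finite sleep */

/* Saturating conversion: returns UINT64_MAX if the result would overflow. */
uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec, nsec;

	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

	sec = (uint64_t)ts->tv_sec;
	nsec = (uint64_t)ts->tv_nsec;
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a caller asking for a timeout never passes the sentinel. */
uint64_t
sk_sleep_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, an overflowing conversion would be indistinguishable from "sleep forever", which is the mistake the macro is meant to prevent.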


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU calls
panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Moreover, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
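
One common way to get the "will not be optimized out" guarantee in portable C is to call memset() through a volatile function pointer. The sketch below shows that technique only; it is not claimed to be how the kernel's explicit_bzero is implemented, and sketch_explicit_bzero() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Because the pointer is volatile, the compiler must load it at run time
 * and cannot assume the callee is memset, so it cannot drop the call as
 * a dead store.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Zero len bytes at buf even if buf is never read again. */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		(memset_ptr)(buf, 0, len);
}
```

A typical use is wiping a key or password buffer right before it goes out of scope, where a plain bzero() or memset() is exactly the call an optimizer is allowed to delete.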


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
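
The flag semantics above can be sketched with a plain singly linked list. Everything here is a hypothetical stand-in (sk_hook, sk_hook_establish(), sk_dohooks(), the SK_ flag values); the kernel's own list type and function signatures are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free the hook after it has run */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Append a hook to the list; returns NULL if allocation fails. */
struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h, **pp;

	assert(head != NULL && fn != NULL);
	h = malloc(sizeof(*h));
	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = NULL;
	for (pp = head; *pp != NULL; pp = &(*pp)->next)
		continue;
	*pp = h;
	return h;
}

/* Run every hook in order, honoring the flags. */
void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	assert(head != NULL);
	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first, so a hook that never returns cannot rerun. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}

/* Example hook: increments the int that arg points to. */
void
sk_count_hook(void *arg)
{
	++*(int *)arg;
}
```

Unlinking before the call is the point of the remove flag: if a shutdown hook panics and the shutdown path is entered again, the list no longer contains the hook that failed.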


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.161 31-Jan-2023 deraadt

On systems without xonly mmu hardware-enforcement, we can still mitigate
against classic BROP with a range-checking wrapper in front of copyin() and
copyinstr() which ensures the userland source doesn't overlap the main program
text, ld.so text, signal tramp text (its mapping is hard to distinguish
so it comes along for the ride), or libc.so text. ld.so tells the kernel
libc.so text range with msyscall(2). The range checking for 2-4 elements is
done without locking (because all 4 ranges are immutable!) and is inexpensive.
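
A rough userland sketch of the kind of range check described above. The struct, the function name overlaps_protected(), and the half-open range convention are assumptions for illustration; the actual kernel wrapper may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end). */
struct text_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Return 1 if the source buffer [addr, addr + len) overlaps any of the
 * n protected ranges, else 0. A wrapper would fail with EFAULT when
 * this returns 1 and call the real copy routine otherwise.
 */
static int
overlaps_protected(const struct text_range *r, size_t n,
    uintptr_t addr, size_t len)
{
	uintptr_t end;
	size_t i;

	if (len == 0)
		return 0;
	/* Treat address wraparound as an overlap; the copy is bogus anyway. */
	if (addr > UINTPTR_MAX - len)
		return 1;
	end = addr + len;
	for (i = 0; i < n; i++) {
		assert(r[i].start <= r[i].end);
		if (addr < r[i].end && r[i].start < end)
			return 1;
	}
	return 0;
}
```

Because the ranges never change once set, a loop like this needs no lock, which matches the "inexpensive" claim in the message.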

write(sock, &open, 400) now fails with EFAULT. No programs have been
discovered which require reading their own text segments with a system call.

On a machine without mmu enforcement, a test program reports the following:
                userland    kernel
ld.so           readable    unreadable
mmap xz         unreadable  unreadable
mmap x          readable    readable
mmap nrx        readable    readable
mmap nwx        readable    readable
mmap xnwx       readable    readable
main            readable    unreadable
libc unmapped?  readable    unreadable
libc mapped     readable    unreadable

ok kettenis, additional help from miod


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.
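
The "first panic wins" behaviour can be sketched in userland C11. Everything here is an assumption for illustration except the 512-byte size: struct cpu_sketch, record_panic() and first_panic are hypothetical stand-ins for the per-CPU state and panicstr.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define PANICBUF_LEN 512	/* buffer size named in the commit message */

/* Hypothetical stand-in for the per-CPU state. */
struct cpu_sketch {
	char panicbuf[PANICBUF_LEN];
};

/* Points at the message of the first CPU to panic, NULL until then. */
static _Atomic(const char *) first_panic = NULL;

/*
 * Record a panic message on the given CPU. Only the first panic on a
 * CPU fills its buffer; later ones leave it alone. Returns 1 if this
 * call was the first panic system-wide, else 0.
 */
static int
record_panic(struct cpu_sketch *ci, const char *fmt, ...)
{
	const char *expected = NULL;
	va_list ap;

	assert(ci != NULL && fmt != NULL);
	if (ci->panicbuf[0] == '\0') {
		va_start(ap, fmt);
		vsnprintf(ci->panicbuf, sizeof(ci->panicbuf), fmt, ap);
		va_end(ap);
	}
	/* Atomically claim "first panic" if nobody has yet. */
	return atomic_compare_exchange_strong(&first_panic, &expected,
	    (const char *)ci->panicbuf);
}
```

Each CPU only ever writes its own buffer, so simultaneous panics cannot interleave their text; the compare-and-swap only decides which one is reported as first.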

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.
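
To make the congestion concern concrete, here is a small sketch of how wait channel addresses might hash into a fixed table of sleep queues. The shift-and-mask function and the table size of 128 are assumptions chosen for illustration, and slpque_index() and shares_queue() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NQUEUES 128	/* illustrative table size; must be a power of two */

static_assert((NQUEUES & (NQUEUES - 1)) == 0, "NQUEUES must be a power of two");

/* Map a wait channel address to a sleep queue index by shift and mask. */
static unsigned int
slpque_index(uintptr_t chan)
{
	return (unsigned int)((chan >> 8) & (NQUEUES - 1));
}

/* Return 1 if a wakeup on chan_a must walk the queue holding chan_b. */
static int
shares_queue(uintptr_t chan_a, uintptr_t chan_b)
{
	return slpque_index(chan_a) == slpque_index(chan_b);
}
```

Under such a scheme, any channel that lands in the same bucket as the global channel would have its wakeups scan past every thread parked there.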

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);
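
A self-contained userland sketch of the same clamp follows. The SK_INFSLP and SK_MAXTSLP values and the helper names ts_to_nsec_sat() and timeout_nsec() are assumptions for this example; the saturating conversion only imitates the overflow behaviour described above for TIMESPEC_TO_NSEC().

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userland stand-ins; the values are assumptions for this sketch. */
#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* longest finite sleep */

/* Convert a non-negative timespec to nanoseconds, saturating at UINT64_MAX. */
static uint64_t
ts_to_nsec_sat(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;

	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 &&
	    ts->tv_nsec < 1000000000L);
	if (sec > (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Clamp so a huge but finite timeout never turns into SK_INFSLP. */
static uint64_t
timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = ts_to_nsec_sat(ts);

	return nsecs < SK_MAXTSLP ? nsecs : SK_MAXTSLP;
}
```

The clamp matters only in the overflow case: without it, a saturated conversion would be indistinguishable from a request to sleep with no timeout.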

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
	struct timeval *tv = (struct timeval *)data;
	uint64_t nsecs;

	if (tv->tv_sec < 0 || !timerisvalid(tv))
		return EINVAL;

	nsecs = TIMEVAL_TO_NSEC(tv);
	if (nsecs > MAXTSLP)
		return EOVERFLOW;

	obj.timeout = nsecs;
	break;
}

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) with config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, the name "KERNEL_ASSERT_LOCKED" is consistent with the other
KERNEL_* macros and with SCHED_ASSERT_LOCKED().

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
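
One common way to get this guarantee in portable C is to call memset() through a volatile function pointer. The sketch below shows that technique only; it is not claimed to be how the kernel function is implemented, and the names sketch_explicit_bzero() and memset_indirect are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Calling memset through a volatile function pointer keeps the compiler
 * from proving which function runs, so it cannot drop the call as a
 * dead store. This is one common technique, not necessarily the kernel's.
 */
static void *(*const volatile memset_indirect)(void *, int, size_t) = memset;

static void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len != 0)
		memset_indirect(buf, 0, len);
}
```

A plain bzero() or memset() on a buffer that is never read again is exactly the case an optimizer is allowed to delete, which is why the indirection is needed.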
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
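The flag semantics above can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel's dohooks() code: the sk_ names, the singly linked list, and the flag values are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values; the kernel's HOOK_* constants may differ. */
#define SK_HOOK_REMOVE	0x01
#define SK_HOOK_FREE	0x02

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook body: bump the int counter passed as the argument. */
static void
sk_count(void *arg)
{
	++*(int *)arg;
}

/* Prepend a hook to the list; returns NULL if allocation fails. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h;

	assert(fn != NULL);
	if ((h = malloc(sizeof(*h))) == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if ((flags & SK_HOOK_REMOVE) == 0) {
		/* Plain run: hooks stay on the list and can run again. */
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first, so a hook that blows up is never retried. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is what keeps a panicking shutdown hook from being re-entered; freeing is kept as a separate flag so that path can skip it.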


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare them accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.160 06-Jan-2023 miod

Remove copystr(9), unless used internally by copy{in,out}str.


Revision tags: OPENBSD_7_2_BASE
# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
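A userland sketch of the scheme described above, using C11 atomics. It is an assumption-laden illustration rather than the kernel code: sk_panic(), sk_panicstr, sk_panicbuf and SK_NCPU are hypothetical names, and only the 512-byte buffer size comes from the message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SK_PANICBUF	512	/* buffer size given in the commit message */
#define SK_NCPU		4	/* arbitrary CPU count for the sketch */

/* Points at the first panic message system-wide; NULL until then. */
static _Atomic(const char *) sk_panicstr;
/* One message buffer per CPU, standing in for ci_panicbuf. */
static char sk_panicbuf[SK_NCPU][SK_PANICBUF];

/*
 * Record a panic on the given CPU.  Only the first message on each CPU
 * is stored.  Returns 1 if this call was the first panic system-wide,
 * 0 otherwise.
 */
static int
sk_panic(int cpu, const char *msg)
{
	const char *expected = NULL;

	assert(cpu >= 0 && cpu < SK_NCPU);
	if (sk_panicbuf[cpu][0] == '\0')
		snprintf(sk_panicbuf[cpu], SK_PANICBUF, "%s", msg);

	/* Compare-and-swap: exactly one caller replaces NULL. */
	return atomic_compare_exchange_strong(&sk_panicstr, &expected,
	    sk_panicbuf[cpu]);
}
```

Because the global pointer is set with a compare-and-swap, two CPUs panicking at once cannot both claim to be first, and each CPU's buffer keeps its own first message.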


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
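The clamping idiom from this message can be tried outside the kernel with a small stand-in. The sketch below is an assumption, not the sys/time.h code: the sketch_ names are hypothetical, the conversion is a hand-written saturating equivalent of what the message says TIMESPEC_TO_NSEC() does, and MAXTSLP is assumed to be one less than INFSLP.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)	/* assumed longest finite sleep */

/* Convert a non-negative timespec to nanoseconds, saturating on overflow. */
static uint64_t
sketch_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec, nsec;

	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
	sec = (uint64_t)ts->tv_sec;
	nsec = (uint64_t)ts->tv_nsec;
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a huge but finite timeout never turns into INFSLP. */
static uint64_t
sketch_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sketch_timespec_to_nsec(ts);

	return nsecs > SKETCH_MAXTSLP ? SKETCH_MAXTSLP : nsecs;
}
```

Without the clamp, a caller that asked for an enormous timeout would silently get an untimed sleep, because the saturated value is the same bit pattern as the sentinel.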


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
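One common way to get this guarantee in portable C is to call memset() through a volatile function pointer. The sketch below shows that technique only; it is not claimed to be how the kernel's explicit_bzero is implemented, and the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer itself is volatile, so the compiler has to load it at run
 * time and cannot prove the call is a dead store it may drop.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

/* Zero len bytes at buf in a way the optimizer should not remove. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		sketch_memset(buf, 0, len);
}
```

A typical use is wiping a key or password buffer just before it goes out of scope, which is exactly where a plain bzero() on an "otherwise dead" variable is most likely to be optimized away.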


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
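
The flag semantics described above can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel code: every sk_ name, the struct layout, and the flag values are hypothetical, and hooks are pushed at the head of a singly linked list for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values; the kernel's HOOK_* constants may differ. */
#define SK_HOOK_REMOVE	0x01
#define SK_HOOK_FREE	0x02

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example callback: counts how many times it has been run. */
static void
sk_count(void *arg)
{
	++*(int *)arg;
}

/* Allocate a hook and push it on the list; returns NULL on failure. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honouring the REMOVE and FREE flags. */
static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	/* FREE is only meaningful together with REMOVE. */
	assert((flags & SK_HOOK_FREE) == 0 || (flags & SK_HOOK_REMOVE) != 0);

	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first, so a hook that fails is never run again. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is what keeps a failing shutdown hook from being retried: by the time the hook runs it is no longer reachable from the list.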


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.159 03-Sep-2022 kettenis

Allow suspend with root on sdmmc(4).

ok deraadt@


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint()
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
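
The two rules above (the first panic on a CPU owns that CPU's buffer, and panicstr is claimed atomically by the first CPU to panic) can be sketched with C11 atomics. This is a userland illustration only: the sk_ names, the fixed CPU count, and the return convention are assumptions made for the example, not the kernel's interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SK_NCPU		2	/* hypothetical CPU count */
#define SK_PANICBUF	512	/* buffer size named in the log */

struct sk_cpu {
	char panicbuf[SK_PANICBUF];
};

static struct sk_cpu sk_cpus[SK_NCPU];
static _Atomic(const char *) sk_panicstr;	/* NULL until the first panic */

/*
 * Record a panic message for the given CPU.
 * Returns 1 if this call was the first panic system-wide, else 0.
 */
static int
sk_panic(int cpu, const char *msg)
{
	const char *expected = NULL;
	struct sk_cpu *ci;

	assert(cpu >= 0 && cpu < SK_NCPU);
	ci = &sk_cpus[cpu];

	/* Only the first panic on this CPU writes the per-CPU buffer. */
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Claim the global pointer only if nobody has claimed it yet. */
	return atomic_compare_exchange_strong(&sk_panicstr, &expected,
	    (const char *)ci->panicbuf);
}
```

Because the global pointer is set with a compare-and-swap against NULL, two CPUs panicking at once cannot both believe they were first, and each still keeps its own message intact.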


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
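
The clamping idiom above can be tried outside the kernel with a small stand-in. The sketch below is an assumption-laden illustration: the sk_ names and the struct are hypothetical, SK_MAXTSLP is assumed to be one less than INFSLP (the log only says MAXTSLP is below INFSLP), and the conversion helper merely imitates the saturating behaviour the log attributes to TIMESPEC_TO_NSEC().

```c
#include <assert.h>
#include <stdint.h>

#define SK_NSEC_PER_SEC	1000000000ULL
#define SK_INFSLP	UINT64_MAX		/* "no timeout", per the log */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed value for this sketch */

/* Hypothetical stand-in for a relative struct timespec. */
struct sk_timespec {
	int64_t	tv_sec;
	long	tv_nsec;
};

/* Convert to nanoseconds, saturating at UINT64_MAX on overflow. */
static uint64_t
sk_timespec_to_nsec(const struct sk_timespec *ts)
{
	/* The sketch only handles non-negative, relative durations. */
	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0);

	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / SK_NSEC_PER_SEC)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * SK_NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}

/* Clamp so an overflowing duration never turns into "sleep forever". */
static uint64_t
sk_timeout_nsec(const struct sk_timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs < SK_MAXTSLP ? nsecs : SK_MAXTSLP;
}
```

Without the clamp, a duration too large for a uint64_t would saturate to the same value as INFSLP and silently drop the caller's timeout.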


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) with config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
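
One common way to get the "will not be optimized out" guarantee is to call memset through a volatile function pointer, so the compiler cannot prove which function runs and therefore cannot drop the call. The sketch below shows that technique in userland C; it is a swapped-in illustration with hypothetical sk_ names, not a copy of how the kernel's explicit_bzero is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer itself is volatile: each call must reload it, so the
 * compiler cannot assume the target is memset and elide a "dead" store.
 */
static void *(*const volatile sk_memset)(void *, int, size_t) = memset;

/* Zero len bytes at buf, even if buf is never read again afterwards. */
static void
sk_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len != 0)
		(void)sk_memset(buf, 0, len);
}
```

A typical use is wiping a key or password buffer just before it goes out of scope, which is exactly the case where a plain bzero or memset call is most likely to be removed as a dead store.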


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok
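
The rule described above can be sketched as a small predicate. This is an illustration only: the flag values and the names SK_NOLOCK, SK_MPSAFE and sk_needs_biglock are assumptions, not what syscalls.master or the kernel actually used.

```c
#include <stdbool.h>

#define SK_NOLOCK	0x1	/* syscall takes no locks at all */
#define SK_MPSAFE	0x2	/* syscall does its own locking */

/*
 * Decide whether a syscall must run under the big lock.
 * cpu_takes_biglock_intr is true when this CPU handles interrupts that
 * themselves require the big lock.
 */
bool
sk_needs_biglock(int flags, bool cpu_takes_biglock_intr)
{
	if (flags & SK_NOLOCK)
		return false;			/* never needs the big lock */
	if (flags & SK_MPSAFE)
		return cpu_takes_biglock_intr;	/* preserve lock ordering */
	return true;				/* unmarked: always locked */
}
```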


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
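
A rough userland sketch of the flag semantics listed above. The flag names come from the commit message; their values, the struct layout, and the function names hook_add, run_hooks and count_hook are assumptions made for illustration.

```c
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define HOOK_FREE	0x02	/* free the hook after running it */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook body: bump the counter passed as the argument. */
void
count_hook(void *arg)
{
	++*(int *)arg;
}

/* Prepend a hook to the list; returns NULL if allocation fails. */
struct hook *
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honouring HOOK_REMOVE and HOOK_FREE. */
void
run_hooks(struct hook **head, int flags)
{
	struct hook *h, *next;

	for (h = *head; h != NULL; h = next) {
		next = h->next;
		if (flags & HOOK_REMOVE)
			*head = next;	/* unlink first: a crash in fn cannot re-run it */
		h->fn(h->arg);
		if ((flags & HOOK_REMOVE) && (flags & HOOK_FREE))
			free(h);
	}
}
```

Unlinking before the call is what keeps a panicking shutdown hook from being entered again on the next pass.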


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly; also remove private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
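
A sketch of what such a conversion could look like, assuming a non-negative, normalized relative timeval; sk_tvtohz is a hypothetical name and the rounding policy is a guess, not the kernel's exact arithmetic.

```c
#include <limits.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Convert a relative timeval to clock ticks at `hz` ticks per second,
 * rounding up and saturating at INT_MAX. Unlike an hzto()-style
 * function, the current time is never subtracted.
 */
int
sk_tvtohz(const struct timeval *tv, int hz)
{
	uint64_t sec = (uint64_t)tv->tv_sec;
	uint64_t usec = (uint64_t)tv->tv_usec;
	uint64_t ticks;

	if (sec > (uint64_t)INT_MAX / (uint64_t)hz)
		return INT_MAX;
	/* Whole seconds plus the microsecond remainder, rounded up. */
	ticks = sec * (uint64_t)hz + (usec * (uint64_t)hz + 999999) / 1000000;
	return ticks > INT_MAX ? INT_MAX : (int)ticks;
}
```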


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.158 06-Aug-2022 bluhm

Clean up the netlock macros. Merge NET_RLOCK_IN_SOFTNET and
NET_RLOCK_IN_IOCTL, which have the same implementation. The R and
W are hard to see, call the new macro NET_LOCK_SHARED. Rename the
opposite assertion from NET_ASSERT_WLOCKED to NET_ASSERT_LOCKED_EXCLUSIVE.
Update some outdated comments about net locking.
OK mpi@ mvs@


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint(),
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Born from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
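
The buffering rule described in this entry can be sketched in userland C11 as follows. Only the 512-byte size comes from the commit message; the struct, the fixed CPU count, and the names record_panic, cpus and first_panic are assumptions, and real per-CPU state would live in struct cpu_info.

```c
#include <stdatomic.h>
#include <stdio.h>

#define PANICBUF_LEN	512	/* buffer size named in the commit message */
#define NCPU		4	/* assumption: small fixed CPU count */

struct cpu_sketch {
	char panicbuf[PANICBUF_LEN];
};

static struct cpu_sketch cpus[NCPU];
static _Atomic(const char *) first_panic = NULL;

/*
 * Record a panic message for `cpu`. Returns 1 if this CPU was the first
 * in the system to panic, 0 otherwise.
 */
int
record_panic(int cpu, const char *msg)
{
	const char *expected = NULL;
	char *buf = cpus[cpu].panicbuf;

	/* Only the first panic on a given CPU fills that CPU's buffer. */
	if (buf[0] == '\0')
		snprintf(buf, PANICBUF_LEN, "%s", msg);

	/* Compare-and-swap: exactly one CPU wins the race. */
	return atomic_compare_exchange_strong(&first_panic, &expected,
	    (const char *)buf);
}
```

Because each CPU writes only its own buffer, simultaneous panics cannot interleave their text, and a later panic on the same CPU leaves the first message intact.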


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only the ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
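
A self-contained sketch of the saturating conversion and clamp this entry describes. The SK_ names are hypothetical stand-ins for the kernel macros, the value chosen for SK_MAXTSLP is an assumption, and a 64-bit time_t is assumed.

```c
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumption: largest finite sleep */

/* Convert a non-negative timespec to nanoseconds, saturating at UINT64_MAX. */
uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	/* sec * 1e9 + nsec would overflow: report the maximum instead. */
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so a finite timeout can never be mistaken for SK_INFSLP. */
uint64_t
sk_sleep_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, an overflowing conversion returns the same value as the sentinel, and a caller who asked for a very long timeout would sleep with no timeout at all.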


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@
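
One way to picture the timeout semantics above is a nanoseconds-to-ticks helper. This is a sketch under stated assumptions: sk_nsec_to_ticks and SK_INFSLP are hypothetical names, a return of 0 is taken to mean "no timeout", and the kernel's real rounding may differ.

```c
#include <limits.h>
#include <stdint.h>

#define SK_INFSLP	UINT64_MAX	/* sentinel: do not set a timeout */

/*
 * Convert a nanosecond timeout to scheduler ticks. Returns 0 for
 * SK_INFSLP (sleep with no timeout), otherwise at least one tick.
 * tick_nsec is the length of one tick in nanoseconds.
 */
int
sk_nsec_to_ticks(uint64_t nsecs, uint64_t tick_nsec)
{
	uint64_t ticks;

	if (nsecs == SK_INFSLP)
		return 0;
	/* Round up without overflowing, so we never sleep too briefly. */
	ticks = nsecs / tick_nsec + (nsecs % tick_nsec != 0);
	if (ticks == 0)
		ticks = 1;	/* zero nanoseconds still waits for the next tick */
	if (ticks > INT_MAX)
		ticks = INT_MAX;
	return (int)ticks;
}
```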


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
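
A sketch of the mapping idea, with one function only. The sk_ names are hypothetical, and the __has_builtin test is a modern stand-in for the per-compiler checks the commit message mentions; it is not what the header actually contains.

```c
#include <stddef.h>

/*
 * With -ffreestanding the compiler no longer assumes memcpy() and
 * friends have their standard meaning, so the mapping to builtins has
 * to be spelled out by hand.
 */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

void *sk_memcpy_slow(void *, const void *, size_t);

#if __has_builtin(__builtin_memcpy) || defined(__GNUC__)
#define sk_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#else
#define sk_memcpy(d, s, n)	sk_memcpy_slow((d), (s), (n))
#endif

/* Portable byte-by-byte fallback for compilers without the builtin. */
void *
sk_memcpy_slow(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n-- > 0)
		*d++ = *s++;
	return dst;
}
```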


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
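
The wrapper arrangement in the last bullet can be sketched as below. The move_bytes names are hypothetical stand-ins with a simplified signature (no struct uio), and returning EINVAL for a negative length is an assumption about how such a wrapper might reject bad input.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for uiomove(): copy n bytes, sized as size_t. */
int
move_bytes(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Hypothetical stand-in for uiomovei(): the legacy int-sized entry
 * point, reduced to a checked wrapper around the size_t version.
 */
int
move_bytes_i(void *dst, const void *src, int n)
{
	if (n < 0)
		return EINVAL;	/* a negative int must not become a huge size_t */
	return move_bytes(dst, src, (size_t)n);
}
```

Keeping the old name as a thin wrapper lets each caller be reviewed and converted on its own schedule.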


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.
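
A userland sketch of annotating a printf wrapper. It assumes the standard __printf__ archetype in place of the kernel-only __kprintf__ one, which ordinary compilers do not know; sk_log is a hypothetical name.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Argument 3 is the format string and argument 4 is the first variadic
 * argument, so the compiler can check callers' format strings against
 * the values they pass.
 */
int sk_log(char *, size_t, const char *, ...)
    __attribute__((__format__(__printf__, 3, 4)));

/* Format into buf; returns the length vsnprintf() reports. */
int
sk_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

The attribute changes no generated code; it only lets the compiler warn when a caller's arguments do not match the format string.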


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
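
The idea behind explicit_bzero can be illustrated in userland C. This is a
minimal sketch only, not the kernel's implementation: sk_explicit_bzero is a
hypothetical name, and writing through a volatile pointer is just one common
way to keep a compiler from discarding stores to an otherwise dead buffer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: zero a buffer through a volatile pointer so the
 * stores are not treated as dead and optimized away.
 */
static void
sk_explicit_bzero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0)
		*p++ = 0;
}
```

A caller would typically invoke it on a key or password buffer just before
the buffer goes out of scope.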


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
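
The flag semantics described above can be sketched in userland C. Everything
here is an assumption made for illustration: the sk_ names, the flag values,
and the singly linked list are hypothetical and are not the kernel's hook
code. The point is only that a hook is unlinked before it runs, so a hook
that fails cannot be reached a second time.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before executing it */
#define SK_HOOK_FREE	0x02	/* free each hook after executing it */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

struct sk_hook_list {
	struct sk_hook *head;
};

/* Register a hook at the head of the list (most recent runs first). */
static struct sk_hook *
sk_hook_establish(struct sk_hook_list *l, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = l->head;
	l->head = h;
	return h;
}

static void
sk_dohooks(struct sk_hook_list *l, int flags)
{
	struct sk_hook *h;

	/* SK_HOOK_FREE is only meaningful together with SK_HOOK_REMOVE. */
	assert(!(flags & SK_HOOK_FREE) || (flags & SK_HOOK_REMOVE));

	if (!(flags & SK_HOOK_REMOVE)) {
		for (h = l->head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = l->head) != NULL) {
		l->head = h->next;	/* unlink first, then call */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}

/* Example hook: bump the counter passed as argument. */
static void
sk_count(void *arg)
{
	(*(int *)arg)++;
}
```

Running the list with SK_HOOK_REMOVE alone leaves the unlinked nodes
allocated, which mirrors the choice described for shutdown hooks.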


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.157 12-Jul-2022 jca

Add db_rint(), an MI interface to db_enter() copied from kdbrint() in vax code

If ddb.console is set and your serial console driver uses it, db_rint()
lets you enter ddb(4) by typing the ESC D escape sequence. This is
useful for drivers like sfuart(4) where the hardware doesn't have a true
BREAK mechanism.

Suggested by miod@, ok kettenis@ miod@


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
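
The "first panic wins" behaviour can be sketched with C11 atomics. This is a
userland illustration under stated assumptions, not the kernel code:
sk_cpu, sk_panicstr and sk_panic are hypothetical names, and a
compare-and-swap is only one way to set a global pointer atomically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SK_PANICBUF	512	/* buffer size taken from the log entry */

struct sk_cpu {
	char panicbuf[SK_PANICBUF];	/* per-CPU panic message */
};

/* Global pointer to the first panic message; NULL until a CPU panics. */
static _Atomic(const char *) sk_panicstr;

/*
 * Record a panic on the given CPU. Only the first panic on a CPU fills
 * its buffer. Returns 1 if this call was the first panic system-wide.
 */
static int
sk_panic(struct sk_cpu *ci, const char *msg)
{
	const char *expected = NULL;

	assert(ci != NULL && msg != NULL);
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);
	return atomic_compare_exchange_strong(&sk_panicstr, &expected,
	    (const char *)ci->panicbuf);
}
```

A later panic on the same CPU leaves the buffer untouched, so the original
failure stays available for inspection.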


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
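
The clamping described above can be tried in userland with a small sketch.
The names sk_timespec_to_nsec and sk_clamp_timeout are hypothetical, and the
value chosen for SK_MAXTSLP (one below UINT64_MAX) is an assumption made for
illustration; only "INFSLP is UINT64_MAX" and "the conversion saturates on
overflow" come from the log entry. The sketch also assumes a 64-bit time_t.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest real timeout */

/* Saturating conversion: returns UINT64_MAX if the result would overflow. */
static uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec, nsec;

	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
	sec = (uint64_t)ts->tv_sec;
	nsec = (uint64_t)ts->tv_nsec;
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a huge but finite timeout never turns into SK_INFSLP. */
static uint64_t
sk_clamp_timeout(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs < SK_MAXTSLP ? nsecs : SK_MAXTSLP;
}
```

Without the clamp, a saturated conversion would be indistinguishable from a
request to sleep with no timeout at all.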


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
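
The "old int interface as a wrapper around the new size_t interface" step can
be sketched as follows. This is a userland illustration only: sk_move and
sk_movei are hypothetical stand-ins with simplified signatures, not the
kernel's uiomove() and uiomovei(), and rejecting negative lengths with -1 is
an assumption about what such a wrapper might do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the size_t interface: copy n bytes, return 0. */
static int
sk_move(void *dst, const void *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	memmove(dst, src, n);
	return 0;
}

/*
 * Hypothetical legacy interface taking an int length. It validates the
 * sign once and then forwards to the size_t version, so callers can be
 * converted one at a time.
 */
static int
sk_movei(void *dst, const void *src, int n)
{
	if (n < 0)
		return -1;	/* assumption: negative lengths are rejected */
	return sk_move(dst, src, (size_t)n);
}
```

Keeping the sign check in one place means a negative int can never be
silently converted into a very large size_t.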


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Moreover, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
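
Note: one common userland way to get a zeroing call that the compiler cannot
drop is to call memset through a volatile function pointer. The sketch below
illustrates that technique only; it is not taken from this commit, and the
names sketch_memset and sketch_explicit_bzero are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer itself is volatile, so the compiler has to load it and
 * perform the call; it cannot prove the call is a plain memset and
 * eliminate it as a dead store. (Illustrative name, not a kernel symbol.)
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

/* Zero len bytes at buf in a way that survives dead-store elimination. */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len == 0)
		return;
	sketch_memset(buf, 0, len);
}
```

A caller would wipe a key or password buffer with this just before the
buffer goes out of scope, which is exactly the case where a plain bzero is
at risk of being optimized away.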


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
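
Note: the flag semantics described above can be modelled in userland roughly
as follows. This is a sketch under assumptions, not the kernel code: the
sk_ names, the list layout, the tail insertion order, and the handling of
flags == 0 are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after running it */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Append a hook so that hooks run in registration order (assumption). */
struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = NULL;
	while (*head != NULL)
		head = &(*head)->next;
	*head = h;
	return h;
}

/* Run every hook on the list, honouring the flags. */
void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	/* SK_HOOK_FREE is only meaningful together with SK_HOOK_REMOVE. */
	assert((flags & SK_HOOK_FREE) == 0 || (flags & SK_HOOK_REMOVE) != 0);

	if (flags & SK_HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			/* Unlink first: a hook that fails cannot be re-run. */
			*head = h->next;
			h->fn(h->arg);
			if (flags & SK_HOOK_FREE)
				free(h);
		}
	} else {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
	}
}

/* Example callback: count invocations through an int pointer. */
void
sk_count(void *arg)
{
	++*(int *)arg;
}
```

Unlinking before the call is the point of the shutdown-hook change: if a
hook panics and the hooks are entered again, the offending hook is no longer
on the list.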


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.156 05-Jul-2022 visa

Remove old poll/select wakeup mechanism.

Also remove unneeded seltrue() and selfalse().

OK mpi@ jsg@


Revision tags: OPENBSD_7_1_BASE
# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Born from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
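
Note: the "first panic wins" behaviour described above can be sketched in
userland with C11 atomics. This is an illustration under assumptions, not
the kernel implementation: sk_cpus, sk_panicstr, and sk_panic are
hypothetical names, and an empty buffer is used as the "not yet panicked"
marker.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define SK_NCPU		2
#define SK_PANICBUF	512	/* buffer size named in the commit message */

struct sk_cpu {
	char panicbuf[SK_PANICBUF];
};

struct sk_cpu sk_cpus[SK_NCPU];
_Atomic(const char *) sk_panicstr;	/* NULL until the first panic */

/*
 * Record a panic on the given CPU. Only the first panic on a CPU writes
 * that CPU's buffer. Returns 1 if this call was the first panic
 * system-wide, 0 otherwise.
 */
int
sk_panic(int cpu, const char *msg)
{
	struct sk_cpu *ci;
	const char *expected = NULL;

	assert(cpu >= 0 && cpu < SK_NCPU);
	ci = &sk_cpus[cpu];

	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Atomically claim the global pointer if nobody has yet. */
	return atomic_compare_exchange_strong(&sk_panicstr, &expected,
	    ci->panicbuf);
}
```

The compare-and-swap identifies the first CPU to panic, while every CPU
keeps its own first message intact for later inspection.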


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
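
Note: the clamping pattern described above can be tried in userland with a
small model. This is a sketch under assumptions: the SKETCH_ macros and
sketch_ functions are hypothetical stand-ins, and the value chosen for
SKETCH_MAXTSLP (one below UINT64_MAX) is an assumption, not quoted from the
header.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SKETCH_INFSLP	UINT64_MAX		/* "no timeout" marker */
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest real timeout */

/*
 * Saturating timespec -> nanoseconds conversion, modelled on the
 * behaviour described for TIMESPEC_TO_NSEC(): return UINT64_MAX when
 * the result would not fit in a uint64_t.
 */
uint64_t
sketch_timespec_to_nsec(const struct timespec *ts)
{
	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Clamp so that a finite request can never turn into SKETCH_INFSLP. */
uint64_t
sketch_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sketch_timespec_to_nsec(ts);

	return nsecs < SKETCH_MAXTSLP ? nsecs : SKETCH_MAXTSLP;
}
```

Without the clamp, an oversized but finite timespec would saturate to
UINT64_MAX and be read by the sleep functions as "sleep forever".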


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) with config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
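
One common way to get the "will not be optimized out" property described above is to write through a volatile pointer. The sketch below shows that technique only; demo_explicit_zero() is a hypothetical name, and nothing here is claimed to be how the kernel's explicit_bzero is implemented.

```c
#include <stddef.h>

/*
 * Zero a buffer through a volatile pointer. Each store is a volatile
 * access, so the compiler has to emit it even when the buffer is never
 * read again.
 */
static void
demo_explicit_zero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}
```

A caller would typically invoke this on key material or passwords just before the buffer goes out of scope or is freed.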


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
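
The flag semantics above can be illustrated with a small list walker. This is a sketch under stated assumptions: every demo_* and DEMO_* name is hypothetical, a singly linked list stands in for whatever list type the kernel used, and allowing flags == 0 (run without unlinking) is an addition of the sketch, not something the commit describes.

```c
#include <stdlib.h>

#define DEMO_HOOK_REMOVE 0x01	/* unlink each hook before running it */
#define DEMO_HOOK_FREE   0x02	/* free each hook after it has run */

struct demo_hook {
	struct demo_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Prepend a hook; returns 0 on success, -1 if allocation fails. */
static int
demo_hook_establish(struct demo_hook **head, void (*fn)(void *), void *arg)
{
	struct demo_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return -1;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return 0;
}

static void
demo_dohooks(struct demo_hook **head, int flags)
{
	struct demo_hook *h;

	if ((flags & DEMO_HOOK_REMOVE) == 0) {
		/* Plain walk: hooks stay on the list. */
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/*
		 * Unlink first. If fn() fails badly and the list is walked
		 * again, this hook is already gone and cannot loop forever.
		 */
		*head = h->next;
		h->fn(h->arg);
		if (flags & DEMO_HOOK_FREE)
			free(h);
	}
}

/* Example callback: counts how many times it was invoked. */
static void
demo_count(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what makes a re-entered walk safe, which is the property the shutdown path relies on in the description above.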


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare them accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.155 09-Dec-2021 guenther

We only have one syscall table: inline sysent/SYS_MAXSYSCALL and
SYS_syscall as the nosys() function into the MD syscall entry
routines and the SYSCALL_DEBUG support. Adjust alpha's syscall
check to match the other archs. Also, make sysent const to get it
into .rodata.

With that, 'struct emul' is unused: delete it and all its references

ok millert@


Revision tags: OPENBSD_7_0_BASE
# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@
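
The "first panic writes, later panics leave the buffer alone" rule, together with the atomic claim on panicstr, can be sketched in userland C11. All names here (demo_cpu, demo_cpus, demo_panicstr, demo_panic) are hypothetical, the CPU count is fixed at two for the example, and the console printing done by the real panic(9) is left out.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_NCPU	2
#define DEMO_PANICBUF	512	/* buffer size named in the commit message */

struct demo_cpu {
	char panicbuf[DEMO_PANICBUF];
};

static struct demo_cpu demo_cpus[DEMO_NCPU];
static _Atomic(const char *) demo_panicstr;	/* first panic system-wide */

/*
 * Record a panic message for one CPU. Returns 1 if this call filled the
 * CPU's buffer, 0 if the buffer already held an earlier message.
 */
static int
demo_panic(int cpu, const char *msg)
{
	struct demo_cpu *ci = &demo_cpus[cpu];
	const char *expected = NULL;

	if (ci->panicbuf[0] != '\0')
		return 0;	/* later panic on this CPU: buffer untouched */
	snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);
	/* Only the first CPU to reach this point claims the global pointer. */
	atomic_compare_exchange_strong(&demo_panicstr, &expected,
	    (const char *)ci->panicbuf);
	return 1;
}
```

Because each CPU writes only to its own buffer, two CPUs panicking at once cannot interleave their messages; the compare-and-swap just decides which one counts as first.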


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK @bluhm


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
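
The clamping idea can also be tried outside the kernel. The sketch below is self-contained and makes several assumptions: DEMO_INFSLP and DEMO_MAXTSLP are hypothetical stand-ins (with MAXTSLP assumed to be one less than INFSLP), demo_timespec_to_nsec() imitates the saturating conversion described above rather than reproducing TIMESPEC_TO_NSEC(), and the timespec is assumed to be non-negative and normalized.

```c
#include <stdint.h>
#include <time.h>

#define DEMO_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define DEMO_MAXTSLP	(UINT64_MAX - 1)	/* longest real timeout (assumed) */

/* Convert to nanoseconds, saturating at UINT64_MAX on overflow. */
static uint64_t
demo_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/*
 * Clamp so that a very long, but finite, timeout is never mistaken for
 * the "sleep forever" sentinel.
 */
static uint64_t
demo_sleep_nsec(const struct timespec *ts)
{
	uint64_t nsecs = demo_timespec_to_nsec(ts);

	return nsecs > DEMO_MAXTSLP ? DEMO_MAXTSLP : nsecs;
}
```

Without the clamp, a saturated conversion would return the sentinel value itself and silently turn a bounded sleep into an unbounded one.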


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
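
The rename-and-wrap step described above can be sketched in userland C as
follows. This is only an illustration, not the kernel code: struct sketch_uio,
sketch_uiomove() and sketch_uiomovei() are hypothetical stand-ins, and
returning EINVAL for a negative length is an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct uio. */
struct sketch_uio {
	char	*dst;	/* where the next byte goes */
	size_t	 resid;	/* bytes of room remaining */
};

/* The size_t flavour: the length can never be negative. */
static int
sketch_uiomove(const void *buf, size_t n, struct sketch_uio *uio)
{
	if (n > uio->resid)
		n = uio->resid;		/* never copy past the remaining room */
	memcpy(uio->dst, buf, n);
	uio->dst += n;
	uio->resid -= n;
	return 0;
}

/*
 * The legacy int flavour, rewritten as a thin wrapper: reject negative
 * lengths, then hand over to the size_t version.
 */
static int
sketch_uiomovei(const void *buf, int n, struct sketch_uio *uio)
{
	if (n < 0)
		return EINVAL;		/* assumption: error out on n < 0 */
	return sketch_uiomove(buf, (size_t)n, uio);
}
```

Keeping the int variant as a wrapper lets callers be converted one at a time
while the sign check lives in a single place.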


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
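
One common way to get the "will not be optimized out" property is to call
memset() through a volatile function pointer. The sketch below shows that
technique only; it is not claimed to be how the kernel's explicit_bzero() is
implemented, and the sketch_ names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Because the pointer object is volatile, the compiler has to load it and
 * make the call; it cannot prove the target is memset() and drop the store
 * as dead.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

/* Zero a buffer holding "secret" data, even if it is never read again. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset(buf, 0, len);
}
```

A typical caller would clear a key buffer with
sketch_explicit_bzero(key, sizeof(key)) just before returning.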


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
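
The flag semantics listed above can be sketched with a plain singly linked
list. Everything here is illustrative: the sk_ names, the flag values and the
list layout are assumptions of this sketch, not the kernel's definitions.

```c
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after it has run */

struct sk_hook {
	struct sk_hook	*next;
	void		(*fn)(void *);
	void		*arg;
};

/* Push a new hook on the list; returns NULL if allocation fails. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if ((flags & SK_HOOK_REMOVE) == 0) {
		/* Plain run: the list is left untouched. */
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/*
		 * Unlink first, so a hook that blows up and re-enters
		 * sk_dohooks() is not run again.
		 */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}

/* Example hook: bump the counter passed as the argument. */
static void
sk_count(void *arg)
{
	(*(int *)arg)++;
}
```

In this sketch, startup-style callers would pass
SK_HOOK_REMOVE | SK_HOOK_FREE to reclaim memory, and shutdown-style callers
would pass SK_HOOK_REMOVE alone.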


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.154 02-Jun-2021 cheloha

kernel: introduce per-CPU panic(9) message buffers

Add a 512-byte buffer (ci_panicbuf) to each cpu_info struct on each
platform for use by panic(9). The first panic on a given CPU writes
its message to this buffer. Subsequent panics on a given CPU print
the panic message to the console but do not modify the buffer. This
aids debugging in two cases:

- If 2+ CPUs panic simultaneously there is no risk of garbled messages
in the panic buffer.

- If a CPU panics and then the operator causes a second panic while
using ddb(4), the operator can still recall the first failure on
a particular CPU.

Misc. changes to support this bigger change:

- Set panicstr atomically to identify the first CPU to reach panic().

- Tweak db_show_panic_cmd() to print all panic messages across all
CPUs. Prefix the first panic with an asterisk ('*').

- Prefer db_printf() to printf() during a panic if we have it.
Apparently it disturbs less global state.

- On amd64, tweak fault() to write the local panic buffer. This needs
more work.
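
The "first panic wins" behaviour described above can be sketched in userland
with C11 atomics. This is an illustration only: the sk_ names, the fixed CPU
count and the use of atomic_compare_exchange_strong() are assumptions of the
sketch, not the kernel's code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

#define SK_NCPU		4
#define SK_PANICBUFSZ	512

/* Hypothetical stand-in for the per-CPU ci_panicbuf. */
struct sk_cpu {
	char	panicbuf[SK_PANICBUFSZ];
};

static struct sk_cpu sk_cpus[SK_NCPU];
static _Atomic(const char *) sk_panicstr;	/* NULL until the first panic */

/*
 * Record a panic on the given CPU.  Returns 1 if this call was the first
 * panic system-wide, 0 otherwise.
 */
static int
sk_panic(int cpu, const char *msg)
{
	struct sk_cpu *ci = &sk_cpus[cpu];
	const char *expected = NULL;

	/* Only the first panic on this CPU writes the local buffer. */
	if (ci->panicbuf[0] == '\0')
		snprintf(ci->panicbuf, sizeof(ci->panicbuf), "%s", msg);

	/* Exactly one CPU succeeds in swapping NULL for its own buffer. */
	return atomic_compare_exchange_strong(&sk_panicstr, &expected,
	    ci->panicbuf);
}
```

Because each CPU writes only its own buffer, two CPUs panicking at once cannot
interleave their messages, and the compare-and-swap picks a single "first"
panic.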

Prompted by bluhm@ and deraadt@. Mostly written by deraadt@.
Discussed with bluhm@, deraadt@ and kettenis@.

Borne from a discussion on tech@ about making panic(9) more MP-safe:

https://marc.info/?l=openbsd-tech&m=162086462316143&w=2

ok kettenis@, visa@, bluhm@, deraadt@


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check that the caller owns the netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}
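
The clamping idea can also be tried outside the kernel. The following is a
minimal userland sketch: struct sk_timespec and the SK_ and sk_ names are
hypothetical, and the value chosen for SK_MAXTSLP (one below UINT64_MAX) is
an assumption of the sketch.

```c
#include <stdint.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest real timeout */

/* Hypothetical stand-in for a non-negative, normalized struct timespec. */
struct sk_timespec {
	int64_t	tv_sec;
	long	tv_nsec;
};

/* Saturating conversion: returns UINT64_MAX if the result would overflow. */
static uint64_t
sk_timespec_to_nsec(const struct sk_timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/*
 * Clamp below the sentinel so a very long timeout is never mistaken for
 * "sleep forever".
 */
static uint64_t
sk_clamp_timeout(const struct sk_timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, a conversion that saturates to UINT64_MAX would be
indistinguishable from the caller asking for no timeout at all.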

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
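
A minimal userland sketch of the transition pattern described above: a size_t primitive plus a thin wrapper for callers that still hold an int length. The names sketch_move() and sketch_movei() are hypothetical, and returning EINVAL for a negative length is this sketch's choice, not a statement about the kernel's behavior.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the primitive that takes a size_t length. */
static int
sketch_move(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Transitional wrapper for callers that still pass an int.
 * A negative int cast to size_t would become a huge length,
 * so it is rejected before the cast.
 */
static int
sketch_movei(void *dst, const void *src, int n)
{
	if (n < 0)
		return EINVAL;	/* sketch's policy, assumed for illustration */
	return sketch_move(dst, src, (size_t)n);
}
```

Keeping the check in the wrapper means callers can be reviewed and converted one at a time, after which the wrapper can be deleted.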


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
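
A minimal userland sketch of one common way to get the guarantee described above, assuming a hosted C compiler. It calls memset through a volatile function pointer; this is a widely used portable technique and is not claimed to be how the kernel's explicit_bzero is implemented. The name sketch_explicit_bzero() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler has to load it at run time
 * and cannot prove that the call is a plain memset of a dead buffer.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		sketch_memset(buf, 0, len);
}
```

As the entry notes, any such trick may need adjusting as compilers and linkers get smarter.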


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
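
The flag semantics above can be modelled in userland roughly as follows. This is a sketch under assumptions: the sk_ names, the singly linked list, and the use of malloc/free are all hypothetical stand-ins, not the kernel code.

```c
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after it has run */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook: counts how many times it has been called. */
static void
sk_count(void *arg)
{
	(*(int *)arg)++;
}

/* Prepend a heap-allocated hook; returns NULL on allocation failure. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* SK_HOOK_FREE is only honoured together with SK_HOOK_REMOVE. */
static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		*head = h->next;	/* unlink first: a re-entered run skips it */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Because each hook is unlinked before it runs, a hook that fails and causes the list to be walked again is not executed a second time, which is the looping problem the entry describes for shutdown hooks.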


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness:
they are const, so declare them accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.153 28-Apr-2021 sashan

time to add NET_ASSERT_WLOCKED()

with moving towards NET_RLOCK...() we need NET_ASSERT_WLOCKED()
to check caller owns netlock exclusively.

OK bluhm@


Revision tags: OPENBSD_6_9_BASE
# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

	sleep_setup();
	/* check condition or release lock */
	sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

	#include <sys/systm.h>

	tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

	int chan;

	tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which is already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purpose as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
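
The clamping idiom from this entry can be tried in userland with the sketch below. It is a model under assumptions: the SKETCH_ and sketch_ names are hypothetical, SKETCH_MAXTSLP is assumed to be one below UINT64_MAX, and the conversion only imitates the saturating behavior the entry attributes to TIMESPEC_TO_NSEC().

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* INFSLP is UINT64_MAX per the entry; the MAXTSLP value is an assumption. */
#define SKETCH_INFSLP	UINT64_MAX
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)

/*
 * Saturating timespec-to-nanoseconds conversion: returns UINT64_MAX
 * when the exact result would not fit in a uint64_t.
 */
static inline uint64_t
sketch_timespec_to_nsec(const struct timespec *ts)
{
	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + (uint64_t)ts->tv_nsec;
}

/* Clamp so that a finite timeout is never mistaken for "no timeout". */
static inline uint64_t
sketch_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sketch_timespec_to_nsec(ts);

	return nsecs < SKETCH_MAXTSLP ? nsecs : SKETCH_MAXTSLP;
}
```

Without the clamp, an overflowing conversion would yield UINT64_MAX, which the sleep functions would read as a request to sleep without a timeout.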


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
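
A minimal sketch of the wrapper pattern described in 1.107, assuming the
int-based entry point only has to reject negative lengths before handing
off to the size_t version. sketch_move() and sketch_movei() are
hypothetical stand-ins, not the kernel's uiomove()/uiomovei(), and they
copy between plain buffers instead of a struct uio.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size_t-based routine: copies n bytes from src to dst. */
int
sketch_move(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Hypothetical legacy entry point that still takes an int length. */
int
sketch_movei(void *dst, const void *src, int n)
{
	/* Reject negative lengths before they turn into huge size_t values. */
	if (n < 0)
		return EINVAL;
	return sketch_move(dst, src, (size_t)n);
}
```

Keeping the check in the wrapper lets callers be converted one at a time
while the size_t version stays free of sign handling.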


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
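
One common way to get the "will not be optimized out" guarantee that
1.87 describes is to call memset through a volatile function pointer.
The sketch below shows that technique only; it is an assumption for
illustration and not necessarily how the kernel's explicit_bzero is
implemented. sketch_explicit_bzero() is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler has to load it and perform
 * the call; it cannot prove the callee is memset and drop the store.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

/* Hypothetical name: zero len bytes of buf even if buf is dead afterwards. */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset(buf, 0, len);
}
```

A caller would use it on a key or password buffer right before the
buffer goes out of scope or is freed.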


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
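
The flag semantics in 1.52 can be sketched as follows. This is a
self-contained userland model under stated assumptions, not the kernel's
dohooks(): the sk_ names, the singly linked list, and the flag values
are all hypothetical.

```c
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after running it */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Example hook body: bumps the int counter passed as its argument. */
void
sk_count(void *arg)
{
	++*(int *)arg;
}

/* Push a new hook onto the list; returns NULL if allocation fails. */
struct sk_hook *
sk_hook_add(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if ((flags & SK_HOOK_REMOVE) == 0) {
		/* Plain run: the list is left untouched. */
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first, so a hook that fails is never run again. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is what keeps a failing shutdown hook from
being retried; freeing is kept optional so a shutdown path can skip it.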


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.152 08-Feb-2021 mpi

Simplify sleep_setup API to two operations in preparation for splitting
the SCHED_LOCK().

Putting a thread on a sleep queue is reduced to the following:

sleep_setup();
/* check condition or release lock */
sleep_finish();

Previous version ok cheloha@, jmatthew@, ok claudio@


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
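
The overflow check and the clamp described in 1.145 can be modelled in
userland as below. This is a sketch under assumptions: the sk_ names are
hypothetical, SK_MAXTSLP is assumed to be one less than the "no timeout"
value, and the conversion assumes a non-negative, normalized timespec.
It is not the sys/time.h code.

```c
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout", per the log */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed largest finite sleep */

/*
 * Convert a timespec to nanoseconds, saturating at UINT64_MAX.
 * Assumes tv_sec >= 0 and 0 <= tv_nsec < 1000000000.
 */
uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	/* sec * 1e9 + nsec overflows exactly when sec exceeds this bound. */
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so a caller asking for a timeout never passes SK_INFSLP. */
uint64_t
sk_sleep_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs < SK_MAXTSLP ? nsecs : SK_MAXTSLP;
}
```

Without the clamp, a saturated conversion would be indistinguishable
from a request to sleep with no timeout at all.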


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
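
A rough sketch of the kind of guarded mapping 1.122 describes, assuming
a feature test decides whether the builtin may be used. The sk_memcpy
name and the exact preprocessor conditions are hypothetical; the real
header's checks for gcc3, gcc4 and clang are not reproduced here.

```c
#include <stddef.h>

#ifndef __has_builtin
#define __has_builtin(x) 0	/* older compilers lack this feature test */
#endif

#if __has_builtin(__builtin_memcpy) || defined(__GNUC__)
/* Compiler provides the builtin: map the name straight onto it. */
#define sk_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#else
/* Fallback for compilers without the builtin: plain byte copy. */
static void *
sk_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n-- > 0)
		*d++ = *s++;
	return dst;
}
#endif
```

With -ffreestanding the compiler no longer assumes memcpy has its
standard meaning, so an explicit mapping like this is what lets small
fixed-size copies be expanded inline again.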


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
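
One common way to obtain this guarantee in portable C is to call memset()
through a volatile function pointer, so the compiler cannot prove the store
is dead. The sketch below shows that general technique only; it is not
claimed to be how the kernel's explicit_bzero is implemented, and the names
sketch_memset and sketch_explicit_bzero are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer itself is volatile, so every call must reload it and the
 * compiler cannot assume it still points at memset() and elide the call.
 */
static void *(*volatile sketch_memset)(void *, int, size_t) = memset;

/* Zero len bytes at buf in a way the optimizer should not remove. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		sketch_memset(buf, 0, len);
}
```

A typical caller would wipe a key buffer just before it goes out of scope,
which is exactly the "otherwise dead variable" case a plain bzero may lose.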


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
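
The flag semantics listed above can be sketched with a small singly linked
list in userland C. This is an illustration under assumptions, not the
kernel's hook code: the SKETCH_* flag values, struct sketch_hook, and the
sketch_* functions are all hypothetical, and the list order is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SKETCH_HOOK_FREE	0x02	/* free each hook after running it */

struct sketch_hook {
	struct sketch_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Allocate a hook and push it on the list; returns NULL on failure. */
static struct sketch_hook *
sketch_hook_establish(struct sketch_hook **head, void (*fn)(void *), void *arg)
{
	struct sketch_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook, honouring the REMOVE and FREE flags. */
static void
sketch_dohooks(struct sketch_hook **head, int flags)
{
	struct sketch_hook *h;

	/* FREE is only valid together with REMOVE. */
	assert((flags & SKETCH_HOOK_FREE) == 0 ||
	    (flags & SKETCH_HOOK_REMOVE) != 0);

	if ((flags & SKETCH_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		*head = h->next;	/* unlink first: a re-entry skips it */
		h->fn(h->arg);
		if (flags & SKETCH_HOOK_FREE)
			free(h);
	}
}

/* Example callback: bump the int counter passed as the argument. */
static void
sketch_count(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what keeps a failing hook from being run again
if the list is walked a second time, which is the looping problem the
commit describes for shutdown hooks.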


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.151 01-Feb-2021 visa

Remove obsolete vnode operation vector declarations.

OK bluhm@, claudio@, mpi@, semarie@


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional to DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
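
The clamping idea above can be tried out in userland with the sketch below.
Everything in it is a stand-in: struct sketch_timespec and the sketch_*
functions are hypothetical, SKETCH_INFSLP follows the UINT64_MAX value given
in the commit text, and SKETCH_MAXTSLP being one less than that is an
assumption rather than a quote from the header.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SKETCH_MAXTSLP	(UINT64_MAX - 1)	/* assumed: longest real sleep */

/* Hypothetical timespec with a fixed 64-bit seconds field. */
struct sketch_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/* Convert to nanoseconds, saturating at UINT64_MAX on overflow. */
static uint64_t
sketch_timespec_to_nsec(const struct sketch_timespec *ts)
{
	uint64_t nsec;

	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 &&
	    ts->tv_nsec < 1000000000L);
	nsec = (uint64_t)ts->tv_nsec;
	if ((uint64_t)ts->tv_sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return (uint64_t)ts->tv_sec * 1000000000ULL + nsec;
}

/* Clamp so a huge but finite timeout never turns into the sentinel. */
static uint64_t
sketch_timeout_nsec(const struct sketch_timespec *ts)
{
	uint64_t nsecs = sketch_timespec_to_nsec(ts);

	return nsecs > SKETCH_MAXTSLP ? SKETCH_MAXTSLP : nsecs;
}
```

Without the final clamp, an overflowing conversion would be
indistinguishable from a request to sleep with no timeout at all.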


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
calls panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
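
One common way to get this guarantee in portable C is to write the zeroes
through a volatile-qualified pointer, which compilers in practice do not
elide. The sketch below shows that technique only; it is an assumption for
illustration and not necessarily how the kernel's explicit_bzero is
implemented, and sketch_explicit_bzero() is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Zero len bytes at buf using volatile stores, so that the writes are
 * kept even when buf is never read again afterwards.
 */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0)
		*p++ = 0;
}
```

A typical caller would wipe a key or password buffer just before it goes out
of scope or is freed.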


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
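
The flag semantics described above can be sketched in userland as follows.
This is an illustration under stated assumptions, not the kernel code: the
struct layout, the flag values, and the names hook_add(), run_hooks() and
bump_counter() are all hypothetical, and malloc()/free() stand in for the
kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define HOOK_FREE	0x02	/* free each hook after running it */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Append a hook to the list; returns NULL on allocation failure. */
static struct hook *
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h, **pp;

	assert(head != NULL && fn != NULL);
	if ((h = malloc(sizeof(*h))) == NULL)
		return NULL;
	h->next = NULL;
	h->fn = fn;
	h->arg = arg;
	for (pp = head; *pp != NULL; pp = &(*pp)->next)
		continue;
	*pp = h;
	return h;
}

/* Run all hooks in order, honoring HOOK_REMOVE and HOOK_FREE. */
static void
run_hooks(struct hook **head, int flags)
{
	struct hook *h, *next;

	if ((flags & HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = next) {
			next = h->next;
			h->fn(h->arg);
		}
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first, so a hook that fails is never run twice. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & HOOK_FREE)
			free(h);
	}
}

/* Example callback: increments the int that arg points to. */
static void
bump_counter(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is the property the shutdown case relies on: if a
hook panics and the list is walked again, that hook is already gone.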


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are constant, so declare them accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.150 27-Dec-2020 visa

Make NET_LOCK() assertions conditional on DIAGNOSTIC

This saves about 2.5 KiB off amd64's RAMDISK after gzip compression.

OK deraadt@, mpi@, cheloha@


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
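
The saturating conversion and the MAXTSLP clamp described above can be tried
in userland with the sketch below. The SK_* values and the sk_* function
names are hypothetical stand-ins chosen for illustration; in particular,
treating MAXTSLP as UINT64_MAX - 1 is an assumption based on the description
"below INFSLP (UINT64_MAX)". The input is assumed to be a normalized,
non-negative timespec.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" marker */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed largest real timeout */

/* Convert to nanoseconds, saturating at UINT64_MAX on overflow. */
static uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	uint64_t sec, nsec;

	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
	sec = (uint64_t)ts->tv_sec;
	nsec = (uint64_t)ts->tv_nsec;
	if (sec > (UINT64_MAX - nsec) / 1000000000ULL)
		return UINT64_MAX;
	return sec * 1000000000ULL + nsec;
}

/* Clamp so that a very long timeout never turns into "sleep forever". */
static uint64_t
sk_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, an overflowing conversion would return the same value as
the "no timeout" marker, silently changing the meaning of the call.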


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
calls panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
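
The three steps above follow a common pattern for widening a length argument:
keep the int-taking entry point as a thin wrapper around the new size_t
version until every caller is converted. A minimal userland sketch of that
pattern follows. struct sk_buf, sk_move() and sk_movei() are hypothetical
stand-ins for struct uio, uiomove() and uiomovei(), and returning EINVAL for
a negative length is an assumption of this sketch, not a statement about the
kernel's behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much simplified stand-in for struct uio. */
struct sk_buf {
	char	*base;	/* next byte to write */
	size_t	 resid;	/* bytes of space remaining */
};

/* New-style interface: the length is a size_t. */
static int
sk_move(const void *cp, size_t n, struct sk_buf *b)
{
	assert(cp != NULL && b != NULL);
	if (n > b->resid)
		n = b->resid;	/* never write past the destination */
	memcpy(b->base, cp, n);
	b->base += n;
	b->resid -= n;
	return 0;
}

/* Legacy interface: int length, kept as a wrapper during the transition. */
static int
sk_movei(const void *cp, int n, struct sk_buf *b)
{
	if (n < 0)
		return EINVAL;	/* refuse before the value is widened */
	return sk_move(cp, (size_t)n, b);
}
```

Checking the sign in the wrapper keeps a negative int from being converted
into a huge size_t once callers are switched to the new interface.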


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
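
A minimal userland sketch of the idea behind this entry, assuming that stores through a volatile-qualified pointer are enough to stop the compiler from discarding the writes. The name explicit_bzero_sketch is hypothetical, and this is one common technique, not necessarily how the kernel function is implemented.

```c
#include <stddef.h>

/*
 * Zero len bytes at buf.  Every store goes through a volatile lvalue,
 * so the compiler may not treat the writes as dead even when buf is
 * never read again.
 */
void
explicit_bzero_sketch(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len-- > 0)
		*p++ = 0;
}
```

A caller would use it on a key or password buffer just before the buffer goes out of scope, where a plain bzero() is the kind of call an optimizer tends to remove.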


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok
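
The rule described in this entry can be read as a small predicate. A sketch only: the flag names, the function name, and the per-CPU boolean are hypothetical stand-ins, since the real dispatch code is not shown in this log.

```c
#include <stdbool.h>

#define SY_NOLOCK_SKETCH	0x01	/* syscall grabs no locks at all */
#define SY_MPSAFE_SKETCH	0x02	/* syscall does its own locking */

/*
 * Decide whether the big lock must be taken around a syscall.
 * cpu_takes_biglock_intrs says whether the current CPU services
 * interrupts that themselves need the big lock.
 */
bool
syscall_needs_biglock(int flags, bool cpu_takes_biglock_intrs)
{
	if (flags & SY_NOLOCK_SKETCH)
		return false;
	if ((flags & SY_MPSAFE_SKETCH) && !cpu_takes_biglock_intrs)
		return false;
	return true;
}
```

The asymmetry is the point: NOLOCK is unconditional, while MPSAFE only skips the big lock where doing so cannot disturb lock ordering.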


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
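
The flag semantics above can be sketched in userland C. All names carrying a _sketch suffix are hypothetical, the list is a plain singly linked list rather than whatever the kernel uses, and count_call exists only so the sketch can be exercised.

```c
#include <stdlib.h>

#define HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define HOOK_FREE	0x02	/* free each hook after running it */

struct hook_sketch {
	struct hook_sketch *next;
	void (*fn)(void *);
	void *arg;
};

/* Push a hook on the front of the list; NULL on allocation failure. */
struct hook_sketch *
hook_establish_sketch(struct hook_sketch **head, void (*fn)(void *), void *arg)
{
	struct hook_sketch *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/*
 * Run every hook.  With HOOK_REMOVE the hook leaves the list before it
 * runs, so a hook that fails cannot be run again on a second pass.
 * HOOK_FREE is honoured only together with HOOK_REMOVE.
 */
void
dohooks_sketch(struct hook_sketch **head, int flags)
{
	struct hook_sketch *h;

	if (flags & HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			*head = h->next;
			h->fn(h->arg);
			if (flags & HOOK_FREE)
				free(h);
		}
	} else {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
	}
}

/* Example callback: counts invocations through an int pointer. */
void
count_call(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what breaks the loop described for shutdown hooks: if a hook panics and the list is walked again, that hook is already gone.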


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare them accordingly; also remove private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.149 24-Dec-2020 cheloha

tsleep(9): add global "nowake" channel for threads avoiding wakeup(9)

It would be convenient if there were a channel a thread could sleep on
to indicate they do not want any wakeup(9) broadcasts. The easiest way
to do this is to add an "int nowake" to kern_synch.c and extern it in
sys/systm.h. You use it like this:

#include <sys/systm.h>

tsleep_nsec(&nowake, ...);

There is now no need to handroll a local dead channel, e.g.

int chan;

tsleep_nsec(&chan, ...);

which expands the stack. Local dead channels will be replaced with
&nowake in later patches.

One possible problem with this "one global channel" approach is sleep
queue congestion. If you have lots of threads sleeping on &nowake you
might slow down a wakeup(9) on a different channel that hashes into
the same queue. Unsure how much of a problem this actually is, if at all.

NetBSD and FreeBSD have a "pause" interface in the kernel that chooses
a suitable channel automatically. To keep things simple and avoid
adding a new interface we will start with this global channel.

Discussed with mpi@, claudio@, kettenis@, and deraadt@.

Basically designed by kettenis@, who vetoed my other proposals.

Bugs caught by deraadt@, tb@, and patrick@.
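
The congestion concern raised in this entry can be illustrated with a toy sleep queue. Everything here is an assumption made for the sketch: the table size, the shift-and-mask hash, the fixed-size buckets, and all names. It only shows why sleepers parked on one channel add work to wakeups on another channel that hashes to the same bucket.

```c
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS	128	/* assumed table size, a power of two */
#define MAXSLEEPERS	64	/* arbitrary per-bucket cap for the sketch */

struct sleepq_sketch {
	const void *chan[MAXSLEEPERS];
	size_t n;
};

struct sleepq_sketch queues[NBUCKETS];

/* Hash a channel address to a bucket index. */
size_t
bucket_of(const void *chan)
{
	return ((uintptr_t)chan >> 8) & (NBUCKETS - 1);
}

/* Record one sleeper on chan; -1 if the bucket is full. */
int
sleep_on_sketch(const void *chan)
{
	struct sleepq_sketch *q = &queues[bucket_of(chan)];

	if (q->n == MAXSLEEPERS)
		return -1;
	q->chan[q->n++] = chan;
	return 0;
}

/*
 * Wake every sleeper on chan.  Returns how many were woken and stores
 * in *scanned how many bucket entries had to be examined to find them.
 */
size_t
wakeup_sketch(const void *chan, size_t *scanned)
{
	struct sleepq_sketch *q = &queues[bucket_of(chan)];
	size_t i, keep = 0, woken = 0;

	*scanned = q->n;
	for (i = 0; i < q->n; i++) {
		if (q->chan[i] == chan)
			woken++;
		else
			q->chan[keep++] = q->chan[i];
	}
	q->n = keep;
	return woken;
}
```

With three sleepers on a never-woken channel sharing a bucket with one busy channel, each wakeup on the busy channel examines three entries it can never wake.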


Revision tags: OPENBSD_6_8_BASE
# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which is already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only the ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes until
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
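
A self-contained userland sketch of the clamp described in this entry. The _SKETCH constants and both function names are hypothetical; the value chosen for the maximum finite sleep (one below the sentinel) is an assumption, and the conversion assumes a non-negative tv_sec and a tv_nsec below one billion.

```c
#include <stdint.h>
#include <time.h>

#define INFSLP_SKETCH	UINT64_MAX		/* "no timeout" sentinel */
#define MAXTSLP_SKETCH	(UINT64_MAX - 1)	/* assumed longest finite sleep */

/* Saturating conversion: UINT64_MAX if the result would overflow. */
uint64_t
timespec_to_nsec_sketch(const struct timespec *ts)
{
	const uint64_t nsec_per_sec = 1000000000ULL;
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	if (sec > (UINT64_MAX - nsec) / nsec_per_sec)
		return UINT64_MAX;
	return sec * nsec_per_sec + nsec;
}

/*
 * Clamp the converted duration so that a caller who asked for a
 * timeout never hands the "sleep forever" sentinel to the sleep call.
 */
uint64_t
sleep_nsecs_sketch(const struct timespec *ts)
{
	uint64_t nsecs = timespec_to_nsec_sketch(ts);

	return nsecs < MAXTSLP_SKETCH ? nsecs : MAXTSLP_SKETCH;
}
```

Without the clamp, an overflowing timespec would saturate to the same value as the sentinel and silently turn a very long timeout into no timeout at all.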


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskqs in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation for compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) with config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Beyond that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__)))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DAVCT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
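
The flag semantics described above can be sketched in user space roughly as follows. The sk_ names and the singly linked list are assumptions made for illustration; the kernel's hook structures and list handling may differ.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define SK_HOOK_FREE	0x02	/* free the hook after running it */

struct sk_hook {
	struct sk_hook	*next;
	void		(*fn)(void *);
	void		*arg;
};

/* Example callback: counts how many times it has been run. */
void
sk_count_hook(void *arg)
{
	++*(int *)arg;
}

/* Prepend a hook to the list; returns NULL on allocation failure. */
struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honouring the flags. */
void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h, *next;

	/* FREE is only meaningful together with REMOVE. */
	assert((flags & SK_HOOK_FREE) == 0 || (flags & SK_HOOK_REMOVE) != 0);

	for (h = *head; h != NULL; h = next) {
		next = h->next;
		if (flags & SK_HOOK_REMOVE)
			*head = next;	/* unlink first: a failing hook is never rerun */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}
```

Unlinking before the call is the point of HOOK_REMOVE for shutdown hooks: if a hook panics and shutdown is re-entered, the hook that failed is no longer on the list.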


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
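
The difference between the two conversions can be sketched in user space as follows. The sk_ names and the tick rate of 100 are assumptions; the sketch rounds up and does not clamp very large values, which the kernel functions may handle differently.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

#define SK_HZ	100	/* assumed tick rate, in ticks per second */

/* Convert a relative, non-negative timeval to clock ticks, rounding up. */
uint64_t
sk_tvtohz(const struct timeval *tv)
{
	const uint64_t usec_per_tick = 1000000 / SK_HZ;
	uint64_t usecs;

	assert(tv->tv_sec >= 0 && tv->tv_usec >= 0 && tv->tv_usec < 1000000);
	usecs = (uint64_t)tv->tv_sec * 1000000 + (uint64_t)tv->tv_usec;
	return (usecs + usec_per_tick - 1) / usec_per_tick;
}

/* hzto-style variant: subtract the current time from an absolute deadline. */
uint64_t
sk_hzto(const struct timeval *deadline, const struct timeval *now)
{
	struct timeval rel;

	rel.tv_sec = deadline->tv_sec - now->tv_sec;
	rel.tv_usec = deadline->tv_usec - now->tv_usec;
	if (rel.tv_usec < 0) {
		rel.tv_sec--;
		rel.tv_usec += 1000000;
	}
	if (rel.tv_sec < 0)
		return 0;	/* the deadline has already passed */
	return sk_tvtohz(&rel);
}
```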


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.148 26-Aug-2020 visa

Declare hw_{prod,serial,uuid,vendor,ver} in <sys/systm.h>.

OK deraadt@, mpi@


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which is already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@
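
The saturating conversion and the clamp described above can be sketched in user space as follows. The SK_ and sk_ names are hypothetical stand-ins for the kernel macros, and the value chosen for SK_MAXTSLP (one below the sentinel) is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed: longest real timeout */

/* Convert a non-negative timespec to nanoseconds, saturating at UINT64_MAX. */
uint64_t
sk_timespec_to_nsec(const struct timespec *ts)
{
	const uint64_t nsec_per_sec = 1000000000ULL;

	assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
	if ((uint64_t)ts->tv_sec >
	    (UINT64_MAX - (uint64_t)ts->tv_nsec) / nsec_per_sec)
		return UINT64_MAX;	/* the product would overflow */
	return (uint64_t)ts->tv_sec * nsec_per_sec + (uint64_t)ts->tv_nsec;
}

/* Clamp so that a finite timeout is never mistaken for SK_INFSLP. */
uint64_t
sk_timeout_nsec(const struct timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, an overflowing but finite timespec would convert to UINT64_MAX and be treated as a request to sleep with no timeout at all.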


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
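
The wrapper arrangement in the list above can be sketched in user space as follows. The sk_ names and the one-buffer struct are hypothetical simplifications of struct uio, and returning EINVAL for a negative length is an assumed policy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Much-simplified stand-in for struct uio: one destination buffer. */
struct sk_uio {
	char	*dst;		/* where the next byte goes */
	size_t	 resid;		/* bytes of room remaining */
};

/* New-style interface: the length is a size_t and cannot be negative. */
int
sk_uiomove(const void *buf, size_t n, struct sk_uio *uio)
{
	assert(buf != NULL && uio != NULL);
	if (n > uio->resid)
		n = uio->resid;	/* copy only as much as fits */
	memcpy(uio->dst, buf, n);
	uio->dst += n;
	uio->resid -= n;
	return 0;
}

/* Legacy int interface kept as a thin wrapper around the size_t version. */
int
sk_uiomovei(const void *buf, int n, struct sk_uio *uio)
{
	if (n < 0)
		return EINVAL;	/* assumed policy for negative lengths */
	return sk_uiomove(buf, (size_t)n, uio);
}
```

Keeping the old name as a wrapper lets callers be reviewed and converted one at a time, after which the wrapper can be removed.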


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
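
The flag semantics described in this entry can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel's code: the `sk_` names, the list layout, and the flag values are all hypothetical. It shows why removing a hook before running it prevents a re-entered shutdown path from running the same hook again.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values; the real HOOK_* constants may differ. */
#define SK_HOOK_REMOVE	0x01
#define SK_HOOK_FREE	0x02

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Push a new hook on the list; returns NULL if allocation fails. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook, optionally unlinking and freeing each one. */
static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	/* FREE only makes sense together with REMOVE. */
	assert((flags & SK_HOOK_FREE) == 0 || (flags & SK_HOOK_REMOVE) != 0);

	if ((flags & SK_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		*head = h->next;	/* unlink first, so a re-entry skips it */
		h->fn(h->arg);
		if (flags & SK_HOOK_FREE)
			free(h);
	}
}

/* Example hook: counts how often it was called. */
static void
sk_count(void *arg)
{
	++*(int *)arg;
}
```

Unlinking before the call is the design point: if a hook never returns normally, the list no longer contains it when the hooks are walked a second time.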


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
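
A rough sketch of what converting a relative timeval to clock ticks involves, assuming a tick rate `hz` passed in by the caller. The `sk_` names and the struct are hypothetical, and the rounding and saturation rules here are assumptions rather than the kernel's exact behavior.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for struct timeval with a fixed-width tv_sec. */
struct sk_timeval {
	int64_t tv_sec;
	long tv_usec;
};

/*
 * Convert a relative, non-negative timeval to a tick count.
 * Partial ticks are rounded up; results saturate at INT_MAX.
 */
static int
sk_tvtohz(const struct sk_timeval *tv, int hz)
{
	int64_t usec_per_tick, ticks;

	assert(hz > 0 && hz <= 1000000);
	assert(tv->tv_sec >= 0);
	assert(tv->tv_usec >= 0 && tv->tv_usec < 1000000);

	if (tv->tv_sec > INT_MAX / hz)
		return INT_MAX;		/* seconds alone would overflow */

	usec_per_tick = 1000000 / hz;
	ticks = tv->tv_sec * hz +
	    (tv->tv_usec + usec_per_tick - 1) / usec_per_tick;
	return ticks > INT_MAX ? INT_MAX : (int)ticks;
}
```

Unlike an absolute-time conversion, nothing here reads the current time: the input is already an interval.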


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.147 29-May-2020 deraadt

dev/rndvar.h no longer has statistical interfaces (removed during various
conversion steps). it only contains kernel prototypes for 4 interfaces,
all of which legitimately belong in sys/systm.h, which are already included
by all enqueue_randomness() users.


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only the ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

case SIOCTIMEOUT:
{
struct timeval *tv = (struct timeval *)data;
uint64_t nsecs;

if (tv->tv_sec < 0 || !timerisvalid(tv))
return EINVAL;

nsecs = TIMEVAL_TO_NSEC(tv);
if (nsecs > MAXTSLP)
return EOVERFLOW;

obj.timeout = nsecs;
break;
}

Idea suggested by visa@.

ok visa@
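
The clamping idea in this entry can be tried outside the kernel with a small sketch. Everything prefixed `sk_`/`SK_` is hypothetical: the struct stands in for a timespec, the conversion mimics the saturating behavior the message attributes to TIMESPEC_TO_NSEC(), and the MAXTSLP value of UINT64_MAX - 1 is an assumption based on the "below INFSLP" wording.

```c
#include <assert.h>
#include <stdint.h>

#define SK_INFSLP	UINT64_MAX		/* "no timeout" sentinel */
#define SK_MAXTSLP	(UINT64_MAX - 1)	/* assumed: largest real timeout */

/* Hypothetical stand-in for struct timespec with a fixed-width tv_sec. */
struct sk_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/* Convert to nanoseconds, returning UINT64_MAX if the result would overflow. */
static uint64_t
sk_timespec_to_nsec(const struct sk_timespec *ts)
{
	const uint64_t nsec_per_sec = 1000000000ULL;
	uint64_t sec;

	assert(ts->tv_sec >= 0);
	assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);

	sec = (uint64_t)ts->tv_sec;
	if (sec > (UINT64_MAX - (uint64_t)ts->tv_nsec) / nsec_per_sec)
		return UINT64_MAX;
	return sec * nsec_per_sec + (uint64_t)ts->tv_nsec;
}

/* Clamp so a huge but finite timeout never turns into "sleep forever". */
static uint64_t
sk_sleep_nsec(const struct sk_timespec *ts)
{
	uint64_t nsecs = sk_timespec_to_nsec(ts);

	return nsecs > SK_MAXTSLP ? SK_MAXTSLP : nsecs;
}
```

Without the clamp, an overflowing conversion would yield the same value as the sentinel, and the caller's timeout would silently be dropped.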


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
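
The kind of compiler check this entry mentions might look roughly like the following. It is a sketch only: the `sk_memcpy` name is hypothetical, and the real header's conditions for gcc3, gcc4 and clang are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compilers without __has_builtin get a fallback that always answers "no". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/*
 * Map the call to the compiler builtin when one is known to exist,
 * otherwise fall back to the ordinary library function.
 */
#if __has_builtin(__builtin_memcpy) || defined(__GNUC__)
#define sk_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#else
#define sk_memcpy(d, s, n)	memcpy((d), (s), (n))
#endif

/* Small helper so the mapping is exercised from ordinary code. */
static size_t
sk_copy_name(char *dst, size_t dstlen, const char *src)
{
	size_t n = strlen(src) + 1;

	assert(dst != NULL && src != NULL);
	if (n > dstlen)
		return 0;	/* does not fit; copy nothing */
	sk_memcpy(dst, src, n);
	return n;
}
```

In a freestanding build the compiler may no longer assume that a function called memcpy has the standard semantics, which is why an explicit mapping is needed to keep the inlined fast paths.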


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
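
One common way to keep a compiler from discarding such a clearing call is to route it through a volatile function pointer, sketched below. This is an assumption for illustration, not necessarily how the kernel function is implemented, and `sk_explicit_bzero` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so the compiler must load it at run time and
 * cannot prove that the call is a plain memset it may drop as a dead store.
 */
static void *(*const volatile sk_memset)(void *, int, size_t) = memset;

/* Zero a buffer holding secret data, even if it is never read again. */
static void
sk_explicit_bzero(void *buf, size_t len)
{
	assert(buf != NULL || len == 0);
	if (len > 0)
		sk_memset(buf, 0, len);
}
```

As the entry itself notes, any such trick may need adjusting as compilers and linkers get smarter.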


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
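
A rough sketch of what a "timeval to ticks" conversion of this kind might
look like in userland follows. The struct, the function name, and the
round-up policy are assumptions for illustration, not the kernel's
tvtohz(); the sketch also assumes hz is non-zero and divides 1000000
evenly, that the fields are non-negative, and that the result fits in
64 bits.

```c
#include <stdint.h>

/* Stand-in for struct timeval, so the sketch has no system dependencies. */
struct sketch_timeval {
	int64_t sec;
	int64_t usec;
};

/*
 * Convert a relative interval to a tick count at the given tick rate.
 * Partial ticks are rounded up so the interval is never cut short.
 */
static uint64_t
sketch_tv_to_ticks(const struct sketch_timeval *tv, unsigned int hz)
{
	uint64_t usec_per_tick = 1000000ULL / hz;
	uint64_t ticks = (uint64_t)tv->sec * hz;

	ticks += ((uint64_t)tv->usec + usec_per_tick - 1) / usec_per_tick;
	return ticks;
}
```

Unlike an hzto()-style routine, nothing here reads the current time: the
input is treated as a duration, not as an absolute deadline.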


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.146 27-May-2020 mpi

Document the various flavors of NET_LOCK() and rename the reader version.

Since our last concurrency mistake only the ioctl(2) and sysctl(2) code paths
take the reader lock. This is mostly for documentation purposes as long as
the softnet thread is converted back to use a read lock.

dlg@ said that comments should be good enough.

ok sashan@


Revision tags: OPENBSD_6_7_BASE
# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@
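
The nanosecond conversion mentioned in 1.142 can be sketched in userland
as a saturating multiply-and-add. The function name below is hypothetical
and this is not the sys/time.h implementation; it only illustrates the idea
of returning UINT64_MAX when the result would not fit in a uint64_t. The
sketch assumes the caller has already rejected negative fields.

```c
#include <stdint.h>
#include <time.h>

/*
 * Convert a relative timespec to nanoseconds, saturating at UINT64_MAX.
 * Assumes ts->tv_sec >= 0 and 0 <= ts->tv_nsec < 1000000000.
 */
static uint64_t
sketch_timespec_to_nsec(const struct timespec *ts)
{
	const uint64_t nsec_per_sec = 1000000000ULL;
	uint64_t sec = (uint64_t)ts->tv_sec;
	uint64_t nsec = (uint64_t)ts->tv_nsec;

	/* sec * nsec_per_sec + nsec would exceed UINT64_MAX. */
	if (sec > (UINT64_MAX - nsec) / nsec_per_sec)
		return UINT64_MAX;
	return sec * nsec_per_sec + nsec;
}
```

Because the saturated value is the same number that means "no timeout",
a caller that wants a real timeout has to clamp the result before sleeping.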


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@
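
The mapping described in 1.122 might be pictured roughly as follows. The
sketch_ macro names and the single __GNUC__ test are assumptions chosen to
keep the example short; the real header uses its own per-compiler checks,
which are not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/*
 * Map performance-critical routines onto compiler builtins when the
 * compiler is known to provide them, and fall back to the ordinary
 * library functions otherwise.
 */
#if defined(__GNUC__) && __GNUC__ >= 4
#define sketch_memcpy(d, s, n)	__builtin_memcpy((d), (s), (n))
#define sketch_memset(b, c, n)	__builtin_memset((b), (c), (n))
#else
#define sketch_memcpy(d, s, n)	memcpy((d), (s), (n))
#define sketch_memset(b, c, n)	memset((b), (c), (n))
#endif
```

Under -ffreestanding the compiler no longer assumes that a function named
memcpy has the standard semantics, so an explicit mapping like this is one
way to keep the inline expansion for small fixed-size copies.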


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
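
The three-step migration in 1.107 follows a common pattern: keep the old
int-sized entry point alive as a thin wrapper while callers are converted
to the size_t version. A generic userland sketch is shown below. The names
sketch_move() and sketch_movei() are hypothetical, the copy itself is a
plain memcpy rather than anything uio-related, and returning -1 for a
negative length is an assumption, not the kernel's documented behavior.

```c
#include <stddef.h>
#include <string.h>

/* New-style routine: the length is a size_t, so it cannot be negative. */
static int
sketch_move(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Legacy entry point with an int length, kept as a wrapper during the
 * transition. A negative length is rejected before the cast, because
 * converting it to size_t would turn it into a huge positive count.
 */
static int
sketch_movei(void *dst, const void *src, int n)
{
	if (n < 0)
		return -1;
	return sketch_move(dst, src, (size_t)n);
}
```

Once every caller has been reviewed and switched to the size_t variant, the
wrapper can be deleted, which is what revision 1.110 later does for
uiomovei().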


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
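
One widely used userland technique for getting the guarantee described in
1.87 is to call memset() through a volatile function pointer, so the
compiler cannot prove which function is called and therefore cannot drop
the store. The sketch below shows that technique under a hypothetical
name; it is not claimed to be how the kernel's explicit_bzero() is
implemented.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so it must be reloaded at every call and the
 * compiler cannot assume it still points at memset().
 */
static void *(*const volatile sketch_memset_ptr)(void *, int, size_t) = memset;

/* Zero a buffer in a way the optimizer is not allowed to elide. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset_ptr(buf, 0, len);
}
```

A plain bzero() or memset() on a buffer that is never read again is a dead
store as far as the optimizer is concerned, which is exactly the case when
wiping keys or passwords just before returning.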


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.145 20-Mar-2020 cheloha

tsleep_nsec(9): add MAXTSLP macro, the maximum sleep duration

This macro will be useful for truncating durations below INFSLP
(UINT64_MAX) when converting from a timespec or timeval to a count
of nanoseconds before calling tsleep_nsec(9), msleep_nsec(9), or
rwsleep_nsec(9).

A relative timespec can hold many more nanoseconds than a uint64_t
can. TIMESPEC_TO_NSEC() and TIMEVAL_TO_NSEC() check for overflow,
returning UINT64_MAX if the conversion would overflow a uint64_t.

Thus, MAXTSLP will make it easy to avoid inadvertently passing INFSLP
to tsleep_nsec(9) et al. when the caller intended to set a timeout.

The code in such a case might look something like this:

	uint64_t nsecs = MIN(TIMESPEC_TO_NSEC(&ts), MAXTSLP);

The macro may also be useful for rejecting intervals that are "too large",
e.g. for sockets with timeouts, if the timeout duration is to be stored
as a uint64_t in an object in the kernel. The code in such a case might
look something like this:

	case SIOCTIMEOUT:
	{
		struct timeval *tv = (struct timeval *)data;
		uint64_t nsecs;

		if (tv->tv_sec < 0 || !timerisvalid(tv))
			return EINVAL;

		nsecs = TIMEVAL_TO_NSEC(tv);
		if (nsecs > MAXTSLP)
			return EOVERFLOW;

		obj.timeout = nsecs;
		break;
	}

Idea suggested by visa@.

ok visa@


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)s behavior and prevent noise when a CPU
panic(9) and set splassert_ctl to 0.

Found the hardway by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allows us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, need to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__)))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.144 30-Nov-2019 visa

Move kernel locking inside the sleep machinery. This enables calling
rwsleep(9) with PCATCH and rw_enter(9) with RW_INTR without the kernel
lock. In addition, now tsleep(9) with PCATCH should be safe to use
without the kernel lock if the sleep is purely time-based.

Tested by anton@, cheloha@, chris@
OK anton@, cheloha@


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@
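
A minimal userland sketch of the timeout convention described above, under
stated assumptions: SKETCH_INFSLP, sketch_sec_to_nsec() and
sketch_is_untimed() are hypothetical names, not the kernel's sys/time.h
helpers. It shows a seconds-to-nanoseconds conversion that saturates rather
than wraps, and a check for the "no timeout" sentinel (UINT64_MAX, not zero).

```c
#include <stdint.h>

/* Hypothetical stand-in for the INFSLP sentinel: "do not set a timeout". */
#define SKETCH_INFSLP	UINT64_MAX

/*
 * Convert seconds to nanoseconds, saturating at UINT64_MAX instead of
 * wrapping on overflow. Illustrative only.
 */
static inline uint64_t
sketch_sec_to_nsec(uint64_t secs)
{
	if (secs > UINT64_MAX / 1000000000ULL)
		return UINT64_MAX;
	return secs * 1000000000ULL;
}

/* Returns 1 if the caller asked for an untimed sleep, 0 otherwise. */
static inline int
sketch_is_untimed(uint64_t nsecs)
{
	return nsecs == SKETCH_INFSLP;
}
```

In this sketch a timeout too large to represent saturates to the sentinel
and so reads as "sleep without a timeout", while zero stays a distinct value.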


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
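
A minimal userland sketch of the idea above, assuming one common technique
rather than the kernel's actual implementation: calling memset() through a
volatile function pointer, so the compiler cannot prove the store dead and
drop it. The names sketch_memset, sketch_explicit_bzero() and
sketch_bzero_demo() are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer is volatile, so every call must load it at run time and
 * the compiler cannot assume it still points at memset.
 */
static void *(*const volatile sketch_memset)(void *, int, size_t) = memset;

/* Hypothetical name: zero len bytes at buf without the call being elided. */
static void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset(buf, 0, len);
}

/* Demo: fill a "secret" buffer, erase it, return 1 if every byte is zero. */
static int
sketch_bzero_demo(void)
{
	unsigned char key[16];
	size_t i;

	memset(key, 0xA5, sizeof(key));
	sketch_explicit_bzero(key, sizeof(key));
	for (i = 0; i < sizeof(key); i++)
		if (key[i] != 0)
			return 0;
	return 1;
}
```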


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
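
A simplified userland sketch of the flag semantics listed above, assuming a
plain singly linked list; the sk_ names and SK_HOOK_ constants are
hypothetical and the real kernel hook API differs. With the REMOVE flag each
hook is unlinked before it runs, so a hook that fails cannot be reached
again on a later pass; the FREE flag additionally releases it afterwards.

```c
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define SK_HOOK_FREE	0x02	/* free each hook after running it */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Prepend a hook to the list; returns NULL on allocation failure. */
static struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook on the list, honoring the flags. */
static void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h, *next;

	if (flags & SK_HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			*head = h->next;	/* unlink before calling */
			h->fn(h->arg);
			if (flags & SK_HOOK_FREE)
				free(h);
		}
	} else {
		for (h = *head; h != NULL; h = next) {
			next = h->next;
			h->fn(h->arg);
		}
	}
}

/* Example hook: increment the int that arg points to. */
static void
sk_count(void *arg)
{
	(*(int *)arg)++;
}

/* Demo: two hooks, two passes; returns the total number of calls. */
static int
sk_demo(void)
{
	struct sk_hook *head = NULL;
	int calls = 0;

	sk_hook_establish(&head, sk_count, &calls);
	sk_hook_establish(&head, sk_count, &calls);
	sk_dohooks(&head, SK_HOOK_REMOVE | SK_HOOK_FREE);	/* runs both */
	sk_dohooks(&head, SK_HOOK_REMOVE | SK_HOOK_FREE);	/* list is empty */
	return calls;
}
```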


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.143 02-Nov-2019 cheloha

softclock: move softintr registration/scheduling into timeout module

softclock() is scheduled from hardclock(9) because long ago callouts were
processed from hardclock(9) directly. The introduction of timeout(9) circa
2000 moved all callout processing into a dedicated module, but the softclock
scheduling stayed behind in hardclock(9).

We can move all the softclock() "stuff" into the timeout module to make
kern_clock.c a bit cleaner. Neither initclocks() nor hardclock(9) need
to "know" about softclock(). The initial softclock() softintr registration
can be done from timeout_proc_init() and softclock() can be scheduled
from timeout_hardclock_update().

ok visa@


Revision tags: OPENBSD_6_6_BASE
# 1.142 03-Jul-2019 cheloha

Add tsleep_nsec(9), msleep_nsec(9), and rwsleep_nsec(9).

Equivalent to their unsuffixed counterparts except that (a) they take
a timeout in terms of nanoseconds, and (b) INFSLP, aka UINT64_MAX (not
zero) indicates that a timeout should not be set.

For now, zero nanoseconds is not a strictly valid invocation: we log a
warning on DIAGNOSTIC kernels if we see such a call. We still sleep
until the next tick in such a case, however. In the future this could
become some sort of poll... TBD.

To facilitate conversions to these interfaces: add inline conversion
functions to sys/time.h for turning your timeout into nanoseconds.

Also do a few easy conversions for warmup and to demonstrate how
further conversions should be done.

Lots of input from mpi@ and ratchov@. Additional input from tedu@,
deraadt@, mortimer@, millert@, and claudio@.

Partly inspired by FreeBSD r247787.

positive feedback from deraadt@, ok mpi@


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)s behavior and prevent noise when a CPU
panic(9) and set splassert_ctl to 0.

Found the hardway by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allows us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile und hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked a XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Back out the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing: it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@
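
A rough userland sketch of the wrapper pattern listed above: the size_t routine does the work and the legacy int entry point only validates and forwards. demo_move() and demo_movei() are hypothetical stand-ins, and returning EINVAL for a negative length is an assumption of this sketch, not a statement about what uiomovei() did.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the size_t-based routine: copies n bytes. */
int
demo_move(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/*
 * Legacy int-based entry point kept as a thin wrapper.  A negative
 * length is rejected before the cast, because converting it to
 * size_t would turn it into a huge positive value.
 */
int
demo_movei(void *dst, const void *src, int n)
{
	if (n < 0)
		return EINVAL;	/* assumed error policy for this sketch */
	return demo_move(dst, src, (size_t)n);
}
```

Keeping the int variant as a wrapper lets callers be reviewed and converted one at a time before the wrapper is finally removed.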


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.
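
A small userland illustration of annotating a printf wrapper so the compiler can check format strings against their arguments. It uses the standard __printf__ archetype supported by gcc and clang as a stand-in for the kernel-specific __kprintf__ one. demo_log() is a hypothetical function.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical wrapper.  The attribute says that argument 3 is a
 * printf-style format string and that the variadic arguments to be
 * checked against it start at argument 4.
 */
int demo_log(char *buf, size_t len, const char *fmt, ...)
    __attribute__((__format__(__printf__, 3, 4)));

int
demo_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

With the annotation in place, a call such as demo_log(buf, len, "%s", 42) draws a format warning at compile time instead of misbehaving at run time.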


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
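
One common way to keep a compiler from discarding a "dead" clear is to call memset through a volatile function pointer. The sketch below shows that technique only; it is not claimed to be how explicit_bzero itself is implemented, and the demo_* names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Because the pointer is volatile, the compiler must load it at run
 * time and cannot assume it still points at memset, so it cannot
 * prove the call is a removable dead store.
 */
static void *(*const volatile demo_memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical helper: zero len bytes of buf without being elided. */
void
demo_explicit_bzero(void *buf, size_t len)
{
	demo_memset_ptr(buf, 0, len);
}
```

A typical use is clearing a key buffer just before it goes out of scope, which is exactly the case an ordinary bzero or memset may lose to dead-store elimination.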


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
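
The flag semantics above can be sketched in userland roughly as follows. The list layout and all demo_* / DEMO_* names are hypothetical; only the behaviour of the two flags follows the description in this entry.

```c
#include <stdlib.h>

#define DEMO_HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define DEMO_HOOK_FREE		0x02	/* free each hook after running it */

struct demo_hook {
	struct demo_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Append a hook so that hooks run in the order they were established. */
struct demo_hook *
demo_hook_establish(struct demo_hook **head, void (*fn)(void *), void *arg)
{
	struct demo_hook *h, **pp;

	h = malloc(sizeof(*h));
	if (h == NULL)
		return NULL;
	h->next = NULL;
	h->fn = fn;
	h->arg = arg;
	for (pp = head; *pp != NULL; pp = &(*pp)->next)
		continue;
	*pp = h;
	return h;
}

/* Run every hook on the list, honouring the two flags. */
void
demo_dohooks(struct demo_hook **head, int flags)
{
	struct demo_hook *h;

	if ((flags & DEMO_HOOK_REMOVE) == 0) {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
		return;
	}
	while ((h = *head) != NULL) {
		/* Unlink first: a hook that fails is never run again. */
		*head = h->next;
		h->fn(h->arg);
		if (flags & DEMO_HOOK_FREE)
			free(h);
	}
}

/* Example callback: counts how many times it has been invoked. */
void
demo_count(void *arg)
{
	(*(int *)arg)++;
}
```

Unlinking before the call is what prevents the endless loop mentioned for shutdown hooks: if a hook fails and the list is walked again, the failed hook is no longer on it.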


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
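
A hedged sketch of converting a relative timeval into clock ticks, which is the idea behind a conversion that does not subtract the current time. demo_tvtohz() is hypothetical. Passing hz as a parameter, assuming hz > 0, rounding partial ticks up, and clamping to INT_MAX are all assumptions of this sketch rather than the kernel's exact behaviour.

```c
#include <limits.h>
#include <sys/time.h>

/*
 * Convert a relative interval to ticks.  Assumes hz > 0; the result
 * is only exact when hz divides one million evenly.
 */
int
demo_tvtohz(const struct timeval *tv, int hz)
{
	long long usec_per_tick = 1000000LL / hz;
	long long ticks;

	/* Whole seconds, plus the microseconds rounded up to a full tick. */
	ticks = (long long)tv->tv_sec * hz +
	    (tv->tv_usec + usec_per_tick - 1) / usec_per_tick;
	if (ticks > INT_MAX)
		ticks = INT_MAX;
	return (int)ticks;
}
```

Rounding up means that a caller asking for a short but non-zero interval never ends up with a zero-tick timeout.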


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmask_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile und hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked a XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, need to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__)))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
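
The idea behind explicit_bzero can be illustrated with a small userland
sketch. This is not the OpenBSD implementation; it assumes one common
technique (calling memset through a volatile function pointer so the
compiler cannot prove the store is dead), and the names
sketch_explicit_bzero and sketch_wipe_demo are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumption: routing the call through a volatile function pointer keeps
 * the compiler from eliding the memset as a dead store. The kernel may
 * use a different mechanism.
 */
static void *(*volatile sketch_memset)(void *, int, size_t) = memset;

/* Hypothetical stand-in for explicit_bzero(): zero len bytes at buf. */
void
sketch_explicit_bzero(void *buf, size_t len)
{
	sketch_memset(buf, 0, len);
}

/* Wipe a short "secret" and report 1 if every byte is now zero. */
int
sketch_wipe_demo(void)
{
	char key[16] = "hunter2";
	size_t i;

	sketch_explicit_bzero(key, sizeof(key));
	for (i = 0; i < sizeof(key); i++) {
		if (key[i] != 0)
			return 0;
	}
	return 1;
}
```

A plain bzero or memset on a buffer that is never read again may be removed
by the optimizer, which is the case the commit message is guarding against.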


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
ENODEV to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
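
The flag semantics described above can be sketched in userland C. This is
only an illustration under assumptions: the flag values, the list layout and
the names hook_add, run_hooks and hook_demo are hypothetical, and the list
here runs newest-first, which may differ from the kernel's ordering.

```c
#include <stddef.h>
#include <stdlib.h>

/* Flag values are assumptions made for this sketch. */
#define HOOK_REMOVE	0x01	/* unlink each hook before running it */
#define HOOK_FREE	0x02	/* free each hook after running it */

struct hook {
	struct hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Push a hook on the front of the list; returns NULL if malloc fails. */
struct hook *
hook_add(struct hook **head, void (*fn)(void *), void *arg)
{
	struct hook *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	h->fn = fn;
	h->arg = arg;
	h->next = *head;
	*head = h;
	return h;
}

/* Run every hook, honoring HOOK_REMOVE and HOOK_REMOVE|HOOK_FREE. */
void
run_hooks(struct hook **head, int flags)
{
	struct hook *h, *next;

	if (flags & HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			/* Unlink first, so a hook that fails is not retried. */
			*head = h->next;
			h->fn(h->arg);
			if (flags & HOOK_FREE)
				free(h);
		}
		return;
	}
	for (h = *head; h != NULL; h = next) {
		next = h->next;
		h->fn(h->arg);
	}
}

static void
count_hook(void *arg)
{
	(*(int *)arg)++;
}

/* Returns the total number of hook calls, or -1 if the list was not emptied. */
int
hook_demo(void)
{
	struct hook *head = NULL;
	int calls = 0;

	hook_add(&head, count_hook, &calls);
	hook_add(&head, count_hook, &calls);
	run_hooks(&head, 0);				/* list stays intact */
	run_hooks(&head, HOOK_REMOVE | HOOK_FREE);	/* list is drained */
	run_hooks(&head, HOOK_REMOVE);			/* nothing left to run */
	return (head == NULL) ? calls : -1;
}
```

Unlinking before the call is the point of HOOK_REMOVE: if a hook never
returns normally, a second pass over the list does not reach it again.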


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.
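
As a rough sketch of what such a conversion involves, the following turns a
relative interval into clock ticks without consulting the current time. It is
not the kernel's tvtohz: the fixed tick rate SKETCH_HZ, the struct and the
function names are assumptions for illustration, and the real function's
rounding and overflow handling may differ.

```c
#include <stdint.h>

#define SKETCH_HZ	100	/* assumed tick rate for this sketch */

/* Hypothetical stand-in for struct timeval, non-negative values only. */
struct sketch_timeval {
	uint64_t sec;
	uint64_t usec;
};

/* Convert a relative interval to ticks, rounding partial ticks up. */
uint64_t
sketch_tvtohz(const struct sketch_timeval *tv)
{
	const uint64_t tick_usec = 1000000 / SKETCH_HZ;

	return tv->sec * SKETCH_HZ + (tv->usec + tick_usec - 1) / tick_usec;
}

/* Convenience wrapper so callers can pass seconds and microseconds. */
uint64_t
sketch_ticks_for(uint64_t sec, uint64_t usec)
{
	struct sketch_timeval tv = { sec, usec };

	return sketch_tvtohz(&tv);
}
```

An hzto-style function would first subtract the current time from an absolute
deadline and then perform the same conversion on the difference.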


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.141 23-Apr-2019 visa

Remove file name and line number output from witness(4)

Reduce code clutter by removing the file name and line number output
from witness(4). Typically it is easy enough to locate offending locks
using the stack traces that are shown in lock order conflict reports.
Tricky cases can be tracked using sysctl kern.witness.locktrace=1 .

This patch additionally removes the witness(4) wrapper for mutexes.
Now each mutex implementation has to invoke the WITNESS_*() macros
in order to utilize the checker.

Discussed with and OK dlg@, OK mpi@


Revision tags: OPENBSD_6_4_BASE OPENBSD_6_5_BASE
# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9), a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; virtually all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

In addition, "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
ENODEV to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness: they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.140 31-May-2018 guenther

Add sleep_finish_all(), which provides the common combo of sleep_finish(),
sleep_finish_timeout(), and sleep_finish_signal() with error preferencing,
and then use it in five places.

ok mpi@


Revision tags: OPENBSD_6_3_BASE
# 1.139 20-Mar-2018 mpi

Do not panic from ddb(4) when a lock requirement isn't fulfilled.

Extend the logic already present for panic() to any DDB-related
operation such that if ddb(4) is entered because of a fault or
other trap it is still possible to call 'boot reboot'.

While here stop printing splassert() messages as well, to not fill
the buffer.

ok visa@, deraadt@


# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
panic(9)s and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile und hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked a XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, need to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__)))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DAVCT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they where called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version,ostype,osversion,osrelease and their constanity, they are and declarre 'em accordingly also removing private externies of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le varients of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD



# 1.138 08-Feb-2018 mortimer

Use a temporary chacha instance to fill large randomdata sections. Avoids
grabbing the rnglock repeatedly.

ok deraadt@ djm@


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)s behavior and prevent noise when a CPU
panic(9) and set splassert_ctl to 0.

Found the hardway by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allows us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile und hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlock related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked a XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replace all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, need to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rdnvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of use of this; virtually almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod
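
As an illustration of the guarantee described above, here is a minimal
userland sketch of one common technique: calling memset() through a volatile
function pointer, so the compiler cannot assume which function runs and
should not drop the call as a dead store. The names sk_memset and
sk_explicit_bzero are hypothetical, and this is not necessarily how the
kernel implements explicit_bzero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: a volatile pointer to memset(). Every call must
 * reload the pointer, which keeps the compiler from treating the call
 * as a removable dead store.
 */
static void *(*const volatile sk_memset)(void *, int, size_t) = memset;

/* Hypothetical stand-in for explicit_bzero(): zero len bytes at buf. */
void
sk_explicit_bzero(void *buf, size_t len)
{
	if (len == 0)
		return;
	assert(buf != NULL);
	sk_memset(buf, 0, len);
}
```

A caller would use it on a key or password buffer just before the buffer
goes out of scope, where a plain bzero() is most likely to be optimized out.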


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
o HOOK_REMOVE removes the hook from the list before executing it.
o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
panics (they do that all the #@$%! time these days) we don't loop
for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
how much memory this has leaked over the years).
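
The flag semantics above can be sketched in userland C. Everything here is
hypothetical (the sk_ names, the list layout, the flag values) and only
illustrates the described behavior: with the REMOVE flag a hook is unlinked
before it runs, so a hook that fails cannot be run again, and with the FREE
flag its memory is reclaimed afterwards.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_HOOK_REMOVE	0x01	/* unlink the hook before running it */
#define SK_HOOK_FREE	0x02	/* free the hook after it has run */

struct sk_hook {
	struct sk_hook *next;
	void (*fn)(void *);
	void *arg;
};

/* Append a hook to the list so hooks run in the order they were added. */
struct sk_hook *
sk_hook_establish(struct sk_hook **head, void (*fn)(void *), void *arg)
{
	struct sk_hook *h, **tail;

	assert(head != NULL && fn != NULL);
	if ((h = malloc(sizeof(*h))) == NULL)
		return NULL;
	h->next = NULL;
	h->fn = fn;
	h->arg = arg;
	for (tail = head; *tail != NULL; tail = &(*tail)->next)
		continue;
	*tail = h;
	return h;
}

/* Run every hook on the list, honoring the REMOVE and FREE flags. */
void
sk_dohooks(struct sk_hook **head, int flags)
{
	struct sk_hook *h;

	if (flags & SK_HOOK_REMOVE) {
		while ((h = *head) != NULL) {
			*head = h->next;	/* unlink before calling */
			h->fn(h->arg);
			if (flags & SK_HOOK_FREE)
				free(h);
		}
	} else {
		for (h = *head; h != NULL; h = h->next)
			h->fn(h->arg);
	}
}

/* Example hook: increment the int counter passed as the argument. */
void
sk_count(void *arg)
{
	++*(int *)arg;
}
```

In this sketch, REMOVE without FREE deliberately leaves the unlinked hook
allocated, mirroring the choice described for shutdown hooks.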


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness; they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision


# 1.137 05-Jan-2018 pirofti

Show uvm_fault and trace when typing show panic on a page fault'd kernel

Currently there is only support for amd64, if this change settles
I will add support for the rest of the architectures.

OK kettenis@.


# 1.136 14-Dec-2017 dlg

add code to provide simple wait condition handling.

this will be used to replace the bare sleep_state handling in a
bunch of places, starting with the barriers.


# 1.135 13-Nov-2017 mpi

Do not call splassert_fail() if splassert_ctl is <= 0.

This matches splassert(9)'s behavior and prevents noise when a CPU
calls panic(9) and sets splassert_ctl to 0.

Found the hard way by sthen@


# 1.134 10-Nov-2017 mpi

Introduce a reader version of the NET_LOCK().

This will be used to first allow read-only ioctl(2) to be executed while
the softnet taskq is running. Then it will allow us to execute multiple
softnet taskq in parallel.

Tested by Hrvoje Popovski, ok kettenis@, sashan@, visa@, tb@


Revision tags: OPENBSD_6_2_BASE
# 1.133 11-Aug-2017 mpi

Remove NET_LOCK()'s argument.

Tested by Hrvoje Popovski, ok bluhm@


# 1.132 27-Jul-2017 mpi

Stop doing an splsoftnet()/splx() dance inside the NET_LOCK().

This will allow us to not carry a returned value when entering a critical
section.

ok bluhm@, visa@


# 1.131 29-May-2017 tedu

clang has builtin_memmove. ok deraadt


# 1.130 18-May-2017 kettenis

Add copyin32(9) prototype.


# 1.129 15-May-2017 mpi

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@


# 1.128 30-Apr-2017 mpi

Rename Debugger() into db_enter().

Using a name with the 'db_' prefix makes it invisible from the dynamic
profiler.

ok deraadt@, kettenis@, visa@


# 1.127 30-Apr-2017 mpi

Unifdef KGDB.

It doesn't compile and hasn't been working during the last decade.

ok kettenis@, deraadt@


# 1.126 20-Apr-2017 visa

Hook up mplock to witness(4) on amd64 and i386.


Revision tags: OPENBSD_6_1_BASE
# 1.125 17-Mar-2017 mpi

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK(). That means it doesn't buy us anything except a possible
deadlock that we did not spot. So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@


# 1.124 14-Feb-2017 mpi

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@


# 1.123 25-Jan-2017 mpi

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@


# 1.122 24-Jan-2017 kettenis

In preparation of compiling our kernels with -ffreestanding, explicitly map
a few performance-critical functions to compiler builtins. Since the
builtins supported by gcc3, gcc4 and clang are not the same, there are
(unfortunately) some compiler checks to make sure we only do the mapping
for builtins that are actually supported by the compiler.

ok jca@, tom@, guenther@


# 1.121 29-Dec-2016 mpi

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@


# 1.120 19-Dec-2016 mpi

Introduce the NET_LOCK() a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@


# 1.119 24-Sep-2016 tedu

introduce hashfree() function to free hash tables, with sizes.
ok guenther


# 1.118 17-Sep-2016 jasper

garbage collect dead prototype

ok kettenis@ mpi@


# 1.117 13-Sep-2016 mpi

Introduce rwsleep(9), an equivalent to msleep(9) but for code protected
by a write lock.

ok guenther@, vgross@


# 1.116 04-Sep-2016 mpi

Introduce Dynamic Profiling, a ddb(4) based & gprof compatible kernel
profiling framework.

Code patching is used to enable probes when entering functions. The
probes will call a mcount()-like function to match the behavior of a
GPROF kernel.

Currently only available on amd64 and guarded under DDBPROF. Support
for other archs will follow soon.

A new sysctl knob, ddb.console, needs to be set to 1 in securelevel 0
to be able to use this feature.

Inputs and ok guenther@


# 1.115 03-Sep-2016 naddy

Write the system time back to the RTC every 30 minutes.
This fixes the problem that long-running machines which were not
shut down properly would reboot with a badly offset system time.

hints and ok kettenis@


# 1.114 01-Sep-2016 akfaew

MPSAFE is never used, so get rid of it.

OK natano@ mpi@ guenther@


Revision tags: OPENBSD_6_0_BASE
# 1.113 17-May-2016 bluhm

Backout the previous fix for the sendsyslog(2) with LOG_CONS solution.
Permanently holding /dev/console open in the kernel works only until
init(8) calls revoke(2). After that the console device vnode cannot
be used anymore. It still resulted in a hanging init(8) if it tried
to syslog(3) something. With the backout also dmesg -s works again.


# 1.112 10-May-2016 bluhm

If sendsyslog(2) is called with LOG_CONS before syslogd(8) has been
started and before init(8) has opened the console, the kernel could
crash as the console device has not been initialized. Open
/dev/console in the kernel before starting init(8) and keep it open.
This way sendsyslog(2) can be called early in the system.
OK beck@ deraadt@


# 1.111 24-Mar-2016 mpi

Remove unused ``curpriority'' define.

Its description might be confusing, it was the pre-SMP parent of what
is now ``spc_curpriority'' which reflects the ``p_usrpri'' of curproc.


# 1.110 15-Mar-2016 stefan

Remove now unused legacy uiomovei() function.

All its callers got reviewed and converted to
use uiomove() properly.

ok deraadt@


Revision tags: OPENBSD_5_9_BASE
# 1.109 11-Dec-2015 mpi

Replace mountroothook_establish(9) by config_mountroot(9) a narrower API
similar to config_defer(9).

ok mikeb@, deraadt@


Revision tags: OPENBSD_5_8_BASE
# 1.108 11-Jun-2015 mikeb

Move hzto(9) to the attic; OK dlg


Revision tags: OPENBSD_5_7_BASE
# 1.107 10-Feb-2015 miod

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@


# 1.106 10-Dec-2014 mikeb

retire shutdown hooks; ok deraadt, krw


# 1.105 10-Dec-2014 mikeb

Convert watchdog(4) devices to use autoconf(9) framework.

ok deraadt, tests on glxpcib and ok mpi


# 1.104 18-Nov-2014 miod

Add __attribute__((__bounded__)) to arc4random_buf().
ok deraadt@ tedu@


# 1.103 18-Nov-2014 tedu

move arc4random prototype to systm.h. more appropriate for most code
to include that than rndvar.h. ok deraadt dlg


# 1.102 09-Oct-2014 tedu

remove LKM support


Revision tags: OPENBSD_5_6_BASE
# 1.101 13-Jul-2014 uebayasi

KERNEL_ASSERT_LOCKED(9): Assertion for kernel lock (Rev. 3)

This adds a new assertion macro, KERNEL_ASSERT_LOCKED(), to assert that
kernel_lock is held. In the long process of removing kernel_lock, there will
be a lot (hundreds or thousands) of uses of this; almost all functions
in !MP-safe subsystems should have this assertion. Thus this assertion should
have a short, good name.

Not only that "KERNEL_ASSERT_LOCKED" is consistent with other KERNEL_* and
SCHED_ASSERT_LOCKED() macros.

Input from dlg@ guenther@ kettenis@.

OK dlg@ guenther@


Revision tags: OPENBSD_5_4_BASE OPENBSD_5_5_BASE
# 1.100 11-Jun-2013 deraadt

Replace all ovbcopy with memmove; swap the src and dst arguments too
ok otto


# 1.99 24-Apr-2013 matthew

Add tstohz(9) as the timespec analog to tvtohz(9).

ok miod


# 1.98 06-Apr-2013 tedu

shuffle around some poison code, prototypes, values...
allow some more pool debug code to be enabled if not compiled in
bump poison size back up to 64


# 1.97 06-Apr-2013 tedu

rthreads are always enabled. remove the sysctl.
ok deraadt guenther kettenis matthew


# 1.96 28-Mar-2013 tedu

separate memory poisoning code to a new file and make it usable kernel wide
ok deraadt


Revision tags: OPENBSD_5_3_BASE
# 1.95 09-Feb-2013 miod

Add explicit __attribute__ ((__format__(__kprintf__))) to the functions and
function pointer arguments which are {used as,} wrappers around the kernel
printf function.
No functional change.


# 1.94 17-Oct-2012 deraadt

Swap arguments to wdog_register() since it is nicer, and prepare
wdog_shutdown() for external usage.


# 1.93 26-Sep-2012 brad

Explicitly annotate setjmp() and longjmp() (and friends) as
__returns_twice and __dead instead of depending on GCC's special
handling of these function names.

With input from kettenis@ and guenther@
Fixes a warning from clang
ok matthew@


# 1.92 07-Aug-2012 guenther

Move the common bits of syscall invocation and return handling into
an MI file, <sys/syscall_mi.h>, correcting inconsistencies and the
handling when copyin() of arguments fails.

Tested on i386, amd64, sparc64, and alpha (thanks naddy@)
Any issues with other platforms will be fixed in tree.

header name from millert@; ok miod@


# 1.91 02-Aug-2012 guenther

Apply profiling to all threads instead of just the thread that called
profil() by moving P_PROFIL from proc->p_flag to process->ps_flags with
matching adjustment in fork1() and exit1()

ok matthew@


Revision tags: OPENBSD_5_1_BASE OPENBSD_5_2_BASE
# 1.90 13-Jan-2012 jsing

Switch back to bootduid, however remember to include sys/systm.h...


Revision tags: OPENBSD_5_0_BASE
# 1.89 06-Jul-2011 art

Clean up after P_BIGLOCK removal.
KERNEL_PROC_LOCK -> KERNEL_LOCK
KERNEL_PROC_UNLOCK -> KERNEL_UNLOCK

oga@ ok


# 1.88 26-Apr-2011 jsing

Allow the root device to be identified via its disklabel UID.

ok deraadt@ marco@ krw@


Revision tags: OPENBSD_4_9_BASE
# 1.87 10-Jan-2011 tedu

add a new function, explicit_bzero, to be used for erasing "secret" stuff.
unlike normal bzero, we guarantee that the compiler will not optimize out
calls to this function for otherwise dead variables.
to be adjusted as needed when compilers and linkers get smarter.
ok deraadt miod


# 1.86 21-Sep-2010 matthew

Add assertwaitok(9) to declare code paths that assume they can sleep.
Currently only checks that we're not in an interrupt context, but will
soon check that we're not holding any mutexes either.

Update malloc(9) and pool(9) to use assertwaitok(9) as appropriate.

"i like it" art@, oga@, marco@; "i see no harm" deraadt@; too trivial
for me to bother prying actual oks from people.


# 1.85 07-Sep-2010 deraadt

remove the powerhook code. All architectures now use the ca_activate tree
traversal code to suspend/resume
ok oga kettenis blambert


# 1.84 06-Sep-2010 deraadt

All PWR_{SUSPEND,RESUME} can now be replaced by DVACT_{SUSPEND,RESUME}


# 1.83 27-Aug-2010 deraadt

kill PWR_STANDBY (apm can use PWR_SUSPEND instead). While here, renumber
PWR_{SUSPEND,RESUME} so that they match the values of DVACT_{SUSPEND,RESUME}
so that we can eventually (many more steps...) kill the powerhook garbage
and use the activate mechanism.
no objections


# 1.82 20-Aug-2010 matthew

Change hzto(9) and tvtohz(9) arguments to const pointers.

ok krw@, "of course" tedu@


Revision tags: OPENBSD_4_8_BASE
# 1.81 08-Jul-2010 deraadt

Devices which don't have read or write functionality should not return
enodev to poll, because this returns an errno of 19 in revents. Oops.
Use seltrue where needed, and use a new selfalse function for those which
don't know if the next op will be non-blocking
Mostly discussed with guenther and miod


# 1.80 29-Jun-2010 tedu

Eliminate RTHREADS kernel option in favor of a sysctl. The actual status
(not done) hasn't changed, but now it's less work to test things.
ok art deraadt


# 1.79 20-Apr-2010 tedu

remove proc.h include from uvm_map.h. This has far reaching effects, as
sysctl.h was reliant on this particular include, and many drivers included
sysctl.h unnecessarily. remove sysctl.h or add proc.h as needed.
ok deraadt


# 1.78 06-Apr-2010 tedu

move some of proc.h's greatest hits to systm.h, speeding up compiles.
lots of build testing by deraadt, ok/feedback deraadt guenther kettenis


Revision tags: OPENBSD_4_7_BASE
# 1.77 04-Nov-2009 kettenis

Get rid of __HAVE_GENERIC_SOFT_INTERRUPTS now that all our platforms support it.

ok jsing@, miod@


Revision tags: OPENBSD_4_6_BASE
# 1.76 19-Apr-2009 deraadt

Count number of cpus found (potentially not attached) and store that
in sysctl hw.ncpufound; ok miod kettenis


Revision tags: OPENBSD_4_5_BASE
# 1.75 06-Nov-2008 deraadt

queue the mountroot hooks to be run in the same order


Revision tags: OPENBSD_4_4_BASE
# 1.74 16-May-2008 thib

merge vfs_opv_init into vfs_op_init and remove the former,
as they were called consecutively in vfs_init.


Revision tags: OPENBSD_4_3_BASE
# 1.73 27-Nov-2007 art

Add possibility to add flags to syscalls in syscalls.master to mark
syscalls as NOLOCK and MPSAFE. The flags have slightly different semantics:
NOLOCK - the syscall doesn't grab any locks whatsoever.
MPSAFE - the syscall deals with its own locking.

What this means in practice is that NOLOCK syscalls can always be done
without the biglock. The MPSAFE syscalls can be done without the biglock
on CPUs that don't handle interrupts that require biglock (to preserve
lock ordering).

deraadt@ ok


Revision tags: OPENBSD_4_2_BASE
# 1.72 01-Jun-2007 deraadt

some architectures called setroot() from cpu_configure(), *way* before some
subsystems were enabled. others used a *md_diskconf -> diskconf() method to
make sure init_main could "do late setroot". Change all architectures to
have diskconf(), use it directly & late. tested by todd and myself on most
architectures, ok miod too


# 1.71 11-May-2007 pedro

Don't use LK_CANRECURSE for the kernel lock, okay miod@ art@


Revision tags: OPENBSD_4_1_BASE
# 1.70 26-Oct-2006 jmc

typos; from bret lambert


Revision tags: OPENBSD_4_0_BASE
# 1.69 27-Apr-2006 tedu

use the underscore variants of _BYTE_ORDER which are always defined
even when various "strict" compiler options are used
ok deraadt millert


Revision tags: OPENBSD_3_9_BASE
# 1.68 22-Feb-2006 miod

Remove unused _{ins,rem}que functions - they were not even implemented on
all architectures.


# 1.67 14-Dec-2005 millert

convert _FOO_SOURCE -> __FOO_VISIBLE in machine. OK deraadt@


Revision tags: OPENBSD_3_7_BASE OPENBSD_3_8_BASE
# 1.66 14-Jan-2005 djm

bounds checking for copystr, copyin and copyinstr;
tested by moritz@ otto@ deraadt@, ok deraadt@


# 1.65 28-Nov-2004 deraadt

mountroothooks are called after the root filesystem is mounted.


# 1.64 16-Sep-2004 grange

We don't have vsprintf/sprintf in the kernel anymore, spotted
by form@pdp-11.org.ru.

ok millert@ deraadt@


Revision tags: OPENBSD_3_6_BASE
# 1.63 20-Jun-2004 itojun

boundary-check memcpy and friends. henning ok


# 1.62 13-Jun-2004 niklas

debranch SMP, have fun


Revision tags: SMP_SYNC_A SMP_SYNC_B
# 1.61 08-Jun-2004 marc

pull ncpus support from smp tree into main branch.
remove alpha specific definition of ncpus.
OK (and tested on alpha) deraadt@


Revision tags: OPENBSD_3_5_BASE
# 1.60 05-Jan-2004 espie

unobfuscate systm.h: use va_list for vprintf.
_BSD_VA_LIST_ explained by millert@, okay drahn@


# 1.59 03-Jan-2004 espie

put an mi wrapper around stdarg.h/varargs.h. gcc3 moved stdarg/varargs macros
to built-ins, so eventually we will have one version of these files.
Special adjustments for the kernel to cope: machine/stdarg.h -> sys/stdarg.h
and machine/ansi.h needs to have a _BSD_VA_LIST_ for syslog* prototypes.
okay millert@, drahn@, miod@.


Revision tags: OPENBSD_3_4_BASE
# 1.58 24-Aug-2003 avsm

sprinkle some __kprintf__ attributes around functions which use format
strings in the kernel to make gcc aware of the extra modifiers
deraadt@ ok


# 1.57 21-Jul-2003 tedu

remove caddr_t casts. it's just silly to cast something when the function
takes a void *. convert uiomove to take a void * as well. ok deraadt@


# 1.56 02-Jun-2003 millert

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999. Proofed by myself and Theo.


# 1.55 21-May-2003 art

Match vprintf prototype to userland and standards.

deraadt@ ok


Revision tags: OPENBSD_3_3_BASE UBC_SYNC_A
# 1.54 21-Jan-2003 markus

add kern.watchdog sysctl and generic watchdog interface;
based on feedback and discussions with mickey, henric, fgsch and jakob.
ok art@, mickey@, jakob@, henric@


# 1.53 09-Jan-2003 miod

Remove fetch(9) and store(9) functions from the kernel, and replace the few
remaining instances of them with appropriate copy(9) usage.

ok art@, tested on all arches unless my memory is non-ECC


Revision tags: OPENBSD_3_2_BASE UBC_SYNC_B
# 1.52 12-Jul-2002 art

- Add a flags argument to dohooks.
  The flag can be either HOOK_REMOVE or HOOK_REMOVE|HOOK_FREE.
   o HOOK_REMOVE removes the hook from the list before executing it.
   o HOOK_FREE frees the hook after that.

- Let dostartuphooks use HOOK_REMOVE|HOOK_FREE so we can reclaim the memory.

- Let doshutdownhooks use HOOK_REMOVE so that when some shutdown hook
  panics (they do that all the #@$%! time these days) we don't loop
  for ever. Don't HOOK_FREE, it doesn't matter and I don't want to add
  another possible panic condition for shutdown hooks.

- Actually free the pointer we're throwing away in hook_disestablish (I wonder
  how much memory this has leaked over the years).


# 1.51 06-Jul-2002 nordin

Remove kernel support for NTP. ok deraadt@ and tholo@


# 1.50 15-May-2002 art

Implement splassert() for sparc - a tool for finding problems related to
spl handling (already found 3 problems).

Man page in a few seconds.
deraadt@ ok.


Revision tags: OPENBSD_3_1_BASE
# 1.49 14-Mar-2002 mickey

remove ambiguity in version, ostype, osversion, osrelease and their constness:
they are const, so declare 'em accordingly, also removing private externs of those


# 1.48 14-Mar-2002 millert

Final __P removal plus some cosmetic fixups


# 1.47 14-Mar-2002 millert

First round of __P removal in sys


# 1.46 15-Feb-2002 art

Add a tvtohz function. Like hzto, but doesn't subtract the current time.


# 1.45 04-Feb-2002 miod

Cleanup mountroot-related definitions.


Revision tags: UBC_BASE
# 1.44 06-Nov-2001 art

branches: 1.44.2;
Let fork1, uvm_fork, and cpu_fork take a function/argument pair as argument,
instead of doing fork1, cpu_set_kpc. This lets us retire cpu_set_kpc and
avoid a multiprocessor race.

This commit breaks vax because it doesn't look like any other arch, someone
working on vax might want to look at this and try to adapt the code to be
more like the rest of the world.

Idea and uvm parts from NetBSD.


Revision tags: OPENBSD_3_0_BASE
# 1.43 26-Aug-2001 deraadt

be and le variants of syscallarg; from netbsd


# 1.42 23-Aug-2001 miod

Remove the old timeout legacy code.


# 1.41 27-Jul-2001 niklas

Startup hooks. Can be used for providing root/swap devices from device
systems which want configuration to finish late, like I2O. Implemented via
a general hooks mechanism which the shutdown hooks have been converted to
use as well. It even has manpages!


# 1.40 27-Jun-2001 art

kill old vm


# 1.39 24-Jun-2001 mickey

place extern cold here; per discussion w/ art@


# 1.38 05-May-2001 art

Rename configure() to cpu_configure().
Move it from cpu_startup() to main().


Revision tags: OPENBSD_2_7_BASE OPENBSD_2_8_BASE OPENBSD_2_9_BASE SMP_BASE
# 1.37 02-Jan-2000 assar

branches: 1.37.2;
(sy_call_t): define a type for the functions in sysent
PR 1032


Revision tags: kame_19991208
# 1.36 02-Dec-1999 deraadt

snprintf in kernel; assar@stacken.kth.se


# 1.35 12-Nov-1999 angelos

This shouldn't have been committed with the previous commit, revert
(experimental code)


# 1.34 12-Nov-1999 angelos

Merge dvdio.h and cdio.h, don't use typedefs, get rid of bitfields (no
good reason to use them, not packed structures anyway).


# 1.33 07-Nov-1999 provos

add APM powerhooks.
from NetBSD, Sat Jun 26 08:25:25 1999 UTC by augustss:

Add powerhooks, i.e., the ability to register a function that will be
called when the machine does a suspend or resume.
XXX Will go away when Jason's kevents come to life.


Revision tags: OPENBSD_2_6_BASE
# 1.32 12-Sep-1999 weingart

Fix rootdev handling, use disk checksums to find the device we were booted
from. Hopefully this will fix all the hangs/panics where the root device
was not found.


# 1.31 21-Jul-1999 deraadt

proto mem*() functions


# 1.30 20-May-1999 aaron

fix some typos; kwesterback@home.com


# 1.29 06-May-1999 mickey

add scdebug_{call,ret} to help SYSCALL_DEBUG compile.
remove nsysent extern declaration, since it's no longer defined anywhere,
and SYS_MAXSYSCALL is used everywhere instead.
niklas@ -- ok


# 1.28 28-Apr-1999 art

zap the newhashinit hack.
Add an extra flag to hashinit telling if it should wait in malloc.
update all calls to hashinit.


Revision tags: OPENBSD_2_5_BASE
# 1.27 26-Feb-1999 art

uvm doesn't use nswdev or nswmap.
add prototype for kcopy (uvm)


# 1.26 26-Feb-1999 millert

Add newhashinit(), which is identical to hashinit() except it takes a flags
arg for passing to malloc() (hashinit always uses M_WAITOK which is not
always what you want). Everything that uses hashinit should really
get converted to newhashinit and then newhashinit can be renamed.


# 1.25 10-Jan-1999 niklas

Generalize cpu_set_kpc to take any kind of arg; mostly from NetBSD


Revision tags: OPENBSD_2_3_BASE OPENBSD_2_4_BASE
# 1.24 06-Nov-1997 csapuntz

Updates for VFS Lite 2 + soft update.


# 1.23 04-Nov-1997 chuck

add prototype for vprintf


Revision tags: OPENBSD_2_2_BASE
# 1.22 06-Oct-1997 deraadt

back out vfs lite2 till after 2.2


# 1.21 06-Oct-1997 csapuntz

VFS Lite2 Changes


Revision tags: OPENBSD_2_1_BASE
# 1.20 06-Mar-1997 tholo

Prototype hardpps() if PPS_SYNC option is present


# 1.19 18-Jan-1997 mickey

protect from multiple includes (required by gpl_math_emulate)


# 1.18 14-Jan-1997 kstailey

Debugger() is needed by KGDB not just DDB


# 1.17 08-Dec-1996 niklas

-Wcast-qual happiness


# 1.16 29-Nov-1996 kstailey

back out bitmask_snprintf()


# 1.15 24-Nov-1996 niklas

Added bitmap_snprintf proto


# 1.14 11-Nov-1996 mickey

export vfs_opv_init*


# 1.13 06-Nov-1996 deraadt

proto mountroot and friends


# 1.12 29-Oct-1996 mickey

-Wall happiness, especially for sparc/stand


# 1.11 29-Oct-1996 mickey

-Wall happiness (especially for sparc)


# 1.10 19-Oct-1996 niklas

__assert added, impl from netbsd, however put elsewhere. use it instead
of private versions (one even using the userland header) in if_sn.c


Revision tags: OPENBSD_2_0_BASE
# 1.9 15-Aug-1996 niklas

-Wall, -Wstrict-prototypes and some KNF cleanup


# 1.8 23-Jul-1996 deraadt

make printf/addlog return 0, for compat to userland


# 1.7 02-Jul-1996 niklas

-Wall & -Wstrict-prototype fixes


# 1.6 09-Jun-1996 briggs

Add prototype for hardupdate() ifdef NTP.


# 1.5 02-May-1996 deraadt

proto more stuff


# 1.4 21-Apr-1996 deraadt

partial sync with netbsd 960418, more to come


# 1.3 18-Apr-1996 niklas

Merge of NetBSD 960317


# 1.2 29-Feb-1996 niklas

From NetBSD: Merge with NetBSD 960217


# 1.1 18-Oct-1995 deraadt

branches: 1.1.1;
Initial revision